CN117610002B - Multi-mode feature alignment-based lightweight malicious software threat detection method - Google Patents


Info

Publication number
CN117610002B
CN117610002B (application CN202410086383.2A)
Authority
CN
China
Prior art keywords
software
nodes
label
tag
encoder
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202410086383.2A
Other languages
Chinese (zh)
Other versions
CN117610002A (en)
Inventor
孙捷 (Sun Jie)
车洵 (Che Xun)
陈亚当 (Chen Yadang)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Zhongzhiwei Information Technology Co ltd
Original Assignee
Nanjing Zhongzhiwei Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Zhongzhiwei Information Technology Co ltd filed Critical Nanjing Zhongzhiwei Information Technology Co ltd
Priority to CN202410086383.2A priority Critical patent/CN117610002B/en
Publication of CN117610002A publication Critical patent/CN117610002A/en
Application granted granted Critical
Publication of CN117610002B publication Critical patent/CN117610002B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06F 21/562 — Static detection (under G06F 21/56, computer malware detection or handling, e.g. anti-virus arrangements)
    • G06F 21/552 — Detecting local intrusion or implementing counter-measures involving long-term monitoring or reporting
    • G06F 18/213 — Feature extraction, e.g. by transforming the feature space; summarisation; mappings, e.g. subspace methods
    • G06F 18/2155 — Generating training patterns characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques
    • G06F 18/2415 — Classification techniques based on parametric or probabilistic models, e.g. based on likelihood ratio
    • G06F 40/284 — Lexical analysis, e.g. tokenisation or collocates
    • G06N 3/042 — Knowledge-based neural networks; logical representations of neural networks
    • G06N 3/0455 — Auto-encoder networks; encoder-decoder networks
    • G06N 3/0895 — Weakly supervised learning, e.g. semi-supervised or self-supervised learning


Abstract

The invention discloses a lightweight malware threat detection method based on multi-modal feature alignment, which comprises the following steps: giving the log information and malware label table of a program sample of the software to be detected; analyzing the log information of the software to be detected and preliminarily outputting different probability labels for it; introducing malware labels to construct a vocabulary and obtaining embedded vectors; dividing the nodes in the graph into a plurality of clusters; obtaining classification voting labels of the software to be detected through a cluster voting prompt realignment algorithm; establishing a student encoder to obtain prediction labels of the software to be detected; for the weak labels and prediction labels, calculating the loss of the samples using the real sample labels and a maximized-boundary method; and obtaining a decision result according to the loss of the samples. The method identifies and detects malware threats, realizes efficient and lightweight threat detection, and reduces the risk such threats pose to user data and privacy.

Description

Multi-mode feature alignment-based lightweight malicious software threat detection method
Technical Field
The invention relates to the technical field of network security, in particular to a lightweight malicious software threat detection method based on multi-modal feature alignment.
Background
With the continued development of computer and internet technology, malware threats have become a serious challenge in the field of information security. Malware now takes a variety of forms including, but not limited to, viruses, worms, Trojan horses, ransomware, adware, and malicious browser plug-ins. The variability between these malware types makes detection more complex, posing a serious threat to the information assets of individuals, businesses, and government agencies. These threats may cause data leakage, system paralysis, financial loss, and personal privacy leakage. Malware detection requires not only timely discovery of threats, but also rapid response to isolate, clean, or repair infected systems and data. Real-time operation is critical to limit the spread of threats and reduce potential damage. Thus, malware threat detection is one of the important research directions in the field of information security.
In the current information security environment, the threat forms of malicious software are varied, and attackers continuously adopt new technical means to avoid the traditional detection method. Traditional malware detection methods mainly include feature-based detection and behavior-based detection.
Feature-based detection methods rely on known malware features such as virus signatures or patterns of malicious code. However, this approach is susceptible to malware variants, as an attacker can easily modify the characteristics of malware to evade detection. Furthermore, feature-based detection methods typically require a large library of features, which can lead to wasted storage and computing resources, and are not effective against zero-day threats, i.e., malware that has not been discovered and recorded. Furthermore, malware authors continue to improve tools and techniques to evade detection. This includes techniques of code obfuscation, multi-layer encryption, self-modifying malicious code, etc. These techniques make conventional feature-based detection methods more difficult in identifying and blocking malware.
Behavior-based detection methods attempt to analyze the execution behavior of malware without relying on specific features. Although this approach has certain advantages, it also has some problems. First, behavior-based detection methods typically require extensive training data and complex machine learning models, which increase the computational complexity of detection. Second, this approach may produce false positives, because some legitimate software may exhibit similar behavioral characteristics that are difficult to distinguish. Finally, behavior-based detection methods face challenges in real-time performance and efficiency, since analyzing malware behavior takes time, and time is often a critical factor during malware attacks. Thus, the malware threat detection field currently faces increasingly complex and diverse threats, and conventional approaches often fail to provide adequate protection. To ensure the security of information systems and data, novel artificial-intelligence methods such as deep learning should be combined with malware threat detection research and technology.
Disclosure of Invention
Therefore, it is necessary to provide a method for detecting the threat of lightweight malicious software based on multi-modal feature alignment, which aims to overcome the problems of the conventional method, and by introducing the multi-modal feature alignment technology, the malicious software can be detected more accurately, meanwhile, the computational complexity is reduced, and the real-time performance and the efficiency of the detection are improved by using a lightweight detection model.
To achieve the above object, the present inventors provide a lightweight malware threat detection method based on multi-modal feature alignment, comprising the steps of:
S1, giving log information and a malicious software tag table of a program sample of software to be detected;
S2, analyzing the log information of the software to be detected, obtaining relevant fields through regularization to judge association relations, initializing the relevant fields and their association relations as nodes and edges respectively, inputting the nodes and edges into a graph encoder, which updates the node embeddings through message passing, and preliminarily outputting different probability labels of the software to be detected;
S3, introducing a malicious software tag to construct a vocabulary, and obtaining an embedded vector after the vocabulary passes through a CLIP encoder;
S4, performing spectral clustering on the nodes and edges obtained in the step S2, and dividing the nodes in the graph into a plurality of clusters, so that the nodes in the same cluster have high similarity, and the nodes among different clusters have low similarity;
S5, obtaining a classified voting label of the software to be detected through a cluster voting prompt realignment algorithm by using the embedded vector obtained in the step S3 and the clustering result obtained in the step S4, and forming a weak label by using the classified voting label and the probability label obtained in the step S2;
S6, establishing a student encoder, updating the student encoder by using an exponential moving average for the graph encoder, inputting log information of the software to be detected into the graph encoder, and predicting by using updated weights of nodes and edges to obtain a software prediction tag to be detected;
S7, calculating the loss of the sample by adopting a real sample label and a maximized boundary method aiming at the weak label obtained in the step S5 and the predicted label obtained in the step S6;
And S8, obtaining a decision result according to the loss of the sample, judging whether an execution program of the software to be detected is judged to be a malicious software threat behavior according to the decision result, and adding the execution program into a training set to perform the next round of detection of other software.
As a preferred mode of the present invention, the log information in step S1 includes: date, timestamp, IP address, file path, user operation, port, and event type.
As a preferred mode of the present invention, step S2 further includes:
S201, transforming the original data fields by a regularization method so that every field lies within the same scale range; a field X is regularized to X' according to:

X' = (X − min(X)) / (max(X) − min(X))

wherein X represents an original data field, min(X) and max(X) are the minimum and maximum observed values of that field, and X' represents the regularized data field;
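The regularization of step S201 can be sketched minimally as min-max scaling, assuming (as the "same scale range" wording suggests) that each numeric field is mapped into [0, 1]:

```python
# Hedged sketch of S201: min-max regularization of raw log fields, assuming
# "same scale range" means scaling each numeric field into [0, 1].

def regularize(values):
    """Map a list of raw field values X into X' in [0, 1] via min-max scaling."""
    lo, hi = min(values), max(values)
    if hi == lo:                      # constant field: map everything to 0.0
        return [0.0 for _ in values]
    return [(x - lo) / (hi - lo) for x in values]

ports = [22, 80, 443, 8080]          # illustrative port-number field
print(regularize(ports))
```

Non-numeric fields (IP addresses, file paths) would first need an encoding into numbers; the patent does not specify that mapping.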
S202, selecting relevant fields among the regularized data fields as the initialization of nodes and edges, wherein the nodes represent different data fields and the edges represent association relations among the fields; the initialization of nodes and edges is used for constructing a graph in which nodes represent different data attributes and edges represent the association relations among the attributes; for each field f_i in the log data, its regularized value X'_i is taken as a node, and the expression of node initialization is:

V = {v_1, v_2, ..., v_k}

wherein V represents the set of nodes, each node v_i corresponds to field f_i, and k represents the number of fields;
S203, initializing edges to represent the association relation between the fields, wherein an edge exists between each pair of nodes in a fully connected graph, the weight of each edge is initialized to a default value, and the weights are represented as an adjacency matrix A, wherein A_ij represents the weight of the edge between node v_i and node v_j;
S204, updating the embedded representation of the nodes by using a graph encoder: for the obtained node set V and adjacency matrix A, each node v_i has an embedded vector h_i, initially set to the node's initialized value; the update expression of the graph encoder is:

h_i^(l+1) = σ( Σ_{v_j ∈ N(v_i)} (1 / c_ij) · W^(l) · h_j^(l) )

wherein h_i^(l) represents the embedding of node v_i at layer l, σ represents the activation function, N(v_i) represents the set of neighbor nodes of node v_i, c_ij represents a normalization constant, typically the sum of the weights of the edges between node v_i and its neighbor nodes v_j, and W^(l) represents the weight matrix of layer l for the linear transformation;
S205, using the node feature representation h_i obtained from the graph encoder for prediction, with an additional fully connected layer mapping node features to class scores; the expression of this process is:

z_i = W · h_i + b

wherein z_i represents the classification score of node v_i, W represents a weight matrix, and b represents a bias vector;
S206, converting the classification scores of the nodes into a probability distribution by using the softmax normalization function and taking the most probable class as the label; the expression is:

P(i|z) = softmax(z)_i,    label(v_i) = argmax_i P(i|z)

wherein P(i|z) represents the probability that a given node v_i belongs to class i, and label(v_i) is the label of node v_i.
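Steps S202–S206 can be illustrated end-to-end with a toy NumPy sketch: nodes from regularized field values, a fully connected adjacency, one message-passing update, a fully connected head, and softmax/argmax. All shapes and random weights here are assumptions for illustration, not the patent's trained model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sketch of S202-S206 (assumed shapes, untrained illustrative weights):
# k log fields -> k nodes holding regularized values, fully connected graph.
x = np.array([0.1, 0.4, 0.9, 0.3])        # regularized field values X'_i
k, d, n_cls = len(x), 8, 3                # nodes, hidden dim, label classes
A = np.ones((k, k)) - np.eye(k)           # fully connected, default weight 1

H = np.tile(x[:, None], (1, d))           # initial embeddings h_i
W_l = rng.standard_normal((d, d)) * 0.1   # layer weight matrix W^(l)

# One message-passing update: h_i <- sigma(sum_j (A_ij / c_ij) * W^(l) h_j)
c = A.sum(axis=1, keepdims=True)          # normalization constants c_ij
H = np.maximum(0.0, (A / c) @ H @ W_l)    # ReLU as the activation sigma

# Fully connected head + softmax -> per-node class probabilities (S205-S206)
W = rng.standard_normal((d, n_cls)) * 0.1
b = np.zeros(n_cls)
z = H @ W + b                             # classification scores z_i
P = np.exp(z - z.max(axis=1, keepdims=True))
P /= P.sum(axis=1, keepdims=True)         # softmax: P(i|z)
labels = P.argmax(axis=1)                 # label(v_i) = argmax_i P(i|z)
print(P.shape, labels.shape)
```

A real implementation would stack several such layers and learn W^(l), W, and b by backpropagation.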
As a preferred mode of the present invention, step S3 further includes:
S301, introducing a malicious software tag for describing different types of malicious software features and behaviors, and constructing the malicious software tag into a malicious software tag vocabulary with multi-modal features;
s302, introducing a CLIP encoder, and processing a malicious software tag vocabulary through the CLIP encoder to obtain an embedded vector associated with the tag, wherein the expression is as follows:
Etext=CE(t)
Where E text denotes the embedded vector with which the tag is associated, t denotes the malware tag, and CE denotes the CLIP encoder.
As a preferred mode of the present invention, step S4 further includes: calculating the similarity among the nodes, then calculating a similarity graph matrix, and obtaining the clusters by a clustering algorithm; the expression is:

S_c = {C_1, C_2, ..., C_l}

wherein l represents the number of clusters, C_i is the i-th cluster center, and S_c is the cluster set.
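The spectral clustering of step S4 can be sketched in NumPy: a Gaussian similarity graph, the normalized Laplacian, its smallest eigenvectors as a spectral embedding, and a tiny k-means. The Gaussian kernel and the k-means details are assumptions; the patent only states that a clustering algorithm is applied to the similarity graph matrix:

```python
import numpy as np

# Minimal spectral-clustering sketch of step S4 (assumed Gaussian similarity
# and a plain k-means on the spectral embedding).

def spectral_clusters(H, n_clusters, sigma=1.0, iters=20):
    d2 = ((H[:, None, :] - H[None, :, :]) ** 2).sum(-1)
    S = np.exp(-d2 / (2 * sigma**2))              # similarity graph matrix
    D = np.diag(1.0 / np.sqrt(S.sum(1)))
    L = np.eye(len(H)) - D @ S @ D                # normalized Laplacian
    _, vecs = np.linalg.eigh(L)                   # ascending eigenvalues
    U = vecs[:, :n_clusters]                      # spectral embedding rows
    idx = np.linspace(0, len(U) - 1, n_clusters).astype(int)
    C = U[idx]                                    # spread-out initial centers
    for _ in range(iters):                        # plain k-means on U
        lab = ((U[:, None] - C[None]) ** 2).sum(-1).argmin(1)
        C = np.stack([U[lab == i].mean(0) if (lab == i).any() else C[i]
                      for i in range(n_clusters)])
    return lab

# Two well-separated toy groups of node embeddings
H = np.vstack([np.random.default_rng(1).normal(0.0, 0.1, (5, 2)),
               np.random.default_rng(2).normal(5.0, 0.1, (5, 2))])
labels = spectral_clusters(H, 2)
print(labels)
```

On such well-separated data the two groups land in distinct clusters, matching the stated goal of high intra-cluster and low inter-cluster similarity.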
As a preferred mode of the present invention, step S5 further includes:
S501, combining the embedded vector obtained in the step S3 with the clustering result obtained in the step S4, realigning a software sample through a cluster voting prompt, and obtaining a classification voting label of the software, wherein the expression is as follows:
TC=CVP(Etext,Sc)
wherein T c represents a classification voting label of the software sample text, CVP represents a cluster voting prompt algorithm for realignment and classification, E text represents a multi-modal feature embedding vector of the software sample, and S c represents a clustering result.
S502, combining the probability label and the classified voting label in the step S2 to generate a weak label of the software sample.
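The cluster voting prompt T_c = CVP(E_text, S_c) can be sketched under one explicit assumption: that node features and tag embeddings live in a shared space, so each cluster votes for the tag whose embedding is most similar to the cluster centroid. The one-hot toy embeddings and helper name `cvp` are illustrative, not from the patent:

```python
import numpy as np

# Hedged sketch of the cluster-voting-prompt (CVP) realignment in S5,
# assuming a shared embedding space and centroid-to-tag cosine voting.

def cvp(E_text, H, cluster_labels):
    """Return one voted tag index per cluster (argmax of the voting matrix M)."""
    votes = {}
    for c in np.unique(cluster_labels):
        centroid = H[cluster_labels == c].mean(axis=0)
        sims = E_text @ centroid / (
            np.linalg.norm(E_text, axis=1) * np.linalg.norm(centroid) + 1e-9)
        votes[int(c)] = int(sims.argmax())        # highest-similarity tag wins
    return votes

E_text = np.eye(3)                  # 3 toy tag embeddings (one-hot for clarity)
H = np.array([[0.9, 0.1, 0.0],      # cluster 0 nodes lean toward tag 0
              [0.8, 0.2, 0.0],
              [0.0, 0.1, 0.9],      # cluster 1 nodes lean toward tag 2
              [0.0, 0.2, 0.8]])
labels = np.array([0, 0, 1, 1])
print(cvp(E_text, H, labels))       # {0: 0, 1: 2}
```

The returned per-cluster tags would then be combined with the step-S2 probability labels to form the weak labels.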
As a preferred mode of the present invention, step S6 further includes: establishing a student encoder for converting the log information of the software to be detected into a feature representation, introducing an exponential moving average method to update the weights of the student encoder, and obtaining the prediction label T_s of the student encoder according to the prediction step in step S2.
As a preferred mode of the present invention, in step S7, for the weak label of the software sample obtained in step S5 and the predictive label obtained in step S6, the loss of the sample is calculated using the true sample label and the maximized-boundary method; the expression is:

L(θ) = (1/N) · Σ_{b=1}^{N} max(0, Δ_c(b) + Δ_s(b) − Δ)

where L(θ) represents the loss function of the model, N represents the total number of samples, Δ is a hyperparameter representing the minimum boundary interval used for training the model, Δ_c(b) represents the weak-label loss of sample b, i.e. the difference between the weak label and the real label, and Δ_s(b) represents the predicted-label loss of sample b, i.e. the difference between the model's output probability label and the real label.
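The maximized-boundary loss of step S7 can be sketched as follows. The patent's exact formula is not reproduced in this text, so the hinge form below, in which the combined weak-label loss Δ_c(b) and prediction loss Δ_s(b) are penalized only beyond the minimum boundary interval Δ, is an assumption consistent with the stated definitions:

```python
# Hedged sketch of the S7 loss: hinge over the combined per-sample losses,
# with margin=0.5 as an illustrative (assumed) boundary interval Delta.

def margin_loss(delta_c, delta_s, margin=0.5):
    """Average of max(0, delta_c(b) + delta_s(b) - margin) over samples b."""
    n = len(delta_c)
    return sum(max(0.0, dc + ds - margin)
               for dc, ds in zip(delta_c, delta_s)) / n

delta_c = [0.1, 0.6, 0.0]     # per-sample weak-label losses
delta_s = [0.1, 0.3, 0.2]     # per-sample prediction losses
print(margin_loss(delta_c, delta_s))
```

Only the second sample exceeds the margin here, so it alone contributes to the loss; well-classified samples inside the boundary are ignored, which is the boundary-maximizing behavior the step describes.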
Compared with the prior art, the beneficial effects achieved by the technical scheme are as follows:
(1) According to the method, a malware threat detection framework is constructed on the basis of a traditional model; structural feature information of malware is extracted, and unlabeled data is processed iteratively through a graph neural network to obtain pseudo-supervised structural feature information. A cluster voting prompt realignment algorithm is then adopted, and the malware categories in the vocabulary are initially identified by an iterative graph clustering method. Meanwhile, malware text category prompts are generated using CLIP and the malware vocabulary, and the encoded software representation and the software text category prompts are rearranged into structural alignment. Finally, a lightweight threat detection model is constructed based on a teacher-student learning strategy, so that malware threats can be effectively identified and detected;
(2) The method realizes the structural analysis of unknown, customized malware, thereby achieving efficient and lightweight threat detection;
(3) The method has adaptability and intelligence, can better identify and cope with the continuously evolving threats, improves the safety of computers and networks, and reduces the risks of the threats on user data and privacy.
Drawings
FIG. 1 is a training frame diagram of a method according to an embodiment;
Fig. 2 is a detailed flowchart of a process of the method according to the embodiment.
Detailed Description
In order to describe the technical content, constructional features, achieved objects and effects of the technical solution in detail, the following description is made in connection with the specific embodiments in conjunction with the accompanying drawings.
The embodiment provides a lightweight malware threat detection method based on multi-modal feature alignment, which takes a teacher-student learning framework as the basic structure of malware threat detection analysis. It can detect threats from the log information of malicious software, form a lightweight model classifier based on a multi-modal feature encoder to fully mine the structured information of existing malware data, and, by analyzing potential malware threats, realize efficient and low-cost construction of a threat detection framework. Therefore, no manual intervention is needed, errors caused by human factors are reduced, and the efficiency of network security operations is improved.
As shown in fig. 1 and 2, the method specifically comprises the steps of:
S1, giving log information and a malicious software tag table of a program sample of software to be detected;
S2, analyzing the log information of the software to be detected, obtaining relevant fields through regularization to judge association relations, initializing the relevant fields and their association relations as nodes and edges respectively, inputting the nodes and edges into a graph encoder, which updates the node embeddings through message passing, and preliminarily outputting different probability labels of the software to be detected;
S3, introducing a malicious software tag to construct a vocabulary, and obtaining an embedded vector after the vocabulary passes through a CLIP encoder;
S4, performing spectral clustering on the nodes and edges obtained in the step S2, and dividing the nodes in the graph into a plurality of clusters, so that the nodes in the same cluster have high similarity, and the nodes among different clusters have low similarity;
S5, obtaining a classified voting label of the software to be detected through a cluster voting prompt realignment algorithm by using the embedded vector obtained in the step S3 and the clustering result obtained in the step S4, and forming a weak label by using the classified voting label and the probability label obtained in the step S2;
S6, establishing a student encoder, updating the student encoder by using an exponential moving average for the graph encoder, inputting log information of the software to be detected into the graph encoder, and predicting by using updated weights of nodes and edges to obtain a software prediction tag to be detected;
S7, calculating the loss of the sample by adopting a real sample label and a maximized boundary method aiming at the weak label obtained in the step S5 and the predicted label obtained in the step S6;
And S8, obtaining a decision result according to the loss of the sample, judging whether an execution program of the software to be detected is judged to be a malicious software threat behavior according to the decision result, and adding the execution program into a training set to perform the next round of detection of other software.
In the implementation process of the above embodiment, for step S2, the log information of the software to be detected is analyzed in detail. Such log information typically includes, but is not limited to, date, timestamp, IP address, file path, user operation, port, and event type. To further process this information, the original data fields are transformed using a regularization method to ensure that the regularized data fields lie within the same scale range. Specifically, in the present embodiment, the field X is regularized to X' using the following formula:

X' = (X − min(X)) / (max(X) − min(X))

where X represents an original data field, such as a date, a timestamp, or an IP address, min(X) and max(X) are the minimum and maximum observed values of that field, and X' represents the regularized data field.
In the regularized data fields in step S2, relevant fields are selected as the initialization of nodes and edges; the nodes may represent different data fields, such as date, timestamp, and IP address, and the edges represent the association relations between the fields. The initialization of nodes and edges is used to construct a graph in which nodes represent different data attributes and edges represent the associations between these attributes. For each field f_i in the log data, its regularized value X'_i is taken as a node, so node initialization can be expressed as:

V = {v_1, v_2, ..., v_k}

where V represents the set of nodes, each node v_i corresponds to field f_i, and k is the number of fields.
The initialization of the edges is used for representing the association relation between the fields; in general, a fully connected graph approach may be used, where there is one edge between each pair of nodes. The weights of the edges may be initialized to some default value, such as 1. This may be represented as an adjacency matrix A, where A_ij represents the weight of the edge between node v_i and its neighbor node v_j.
The embedded representation of the nodes is updated using a graph encoder. From the preceding steps, the node set V and the adjacency matrix A are obtained. Each node v_i has an embedded vector h_i, which can be initially set to the node's initialized value. The update formula of the graph encoder can be expressed as:

h_i^(l+1) = σ( Σ_{v_j ∈ N(v_i)} (1 / c_ij) · W^(l) · h_j^(l) )

where h_i^(l) represents the embedding of node v_i at layer l, σ represents the activation function, N(v_i) represents the set of neighbor nodes of node v_i, c_ij represents a normalization constant, typically the sum of the weights of the edges between node v_i and its neighbor nodes v_j, and W^(l) represents the weight matrix of layer l for the linear transformation.
In the above embodiment, the node feature representation h_i obtained from the graph encoder is used for prediction; an additional fully connected layer maps node features to class scores:

z_i = W · h_i + b

where z_i represents the classification score of node v_i, W represents the weight matrix, and b represents the bias vector.
The classification scores of the nodes are obtained, and the scores are converted into a probability distribution using the softmax normalization function; the most probable class is then taken as the label:

P(i|z) = softmax(z)_i,    label(v_i) = argmax_i P(i|z)

where P(i|z) represents the probability that a given node v_i belongs to class i, and label(v_i) is the label of node v_i.
For step S3 in the above embodiment, specifically: to enable multi-modal malware detection, malware tags are first introduced, which describe different types of malware features and behaviors. The purpose of malware tag construction is to build a vocabulary of multi-modal features, so as to better understand and represent the various characteristics of malware.
In this embodiment, a CLIP encoder is introduced, and the malware tag vocabulary is processed by the CLIP encoder to obtain embedded vectors associated with the tags, expressed as:
Etext=CE(t)
Where E text denotes the embedded vector with which the tag is associated, t denotes the tag of malware, and CE denotes the CLIP encoder.
For step S4 in the above embodiment, specifically: the nodes in the graph are divided into a plurality of clusters by spectral clustering, so that nodes in the same cluster have higher similarity and nodes between different clusters have lower similarity. Specifically, the similarity among the nodes is calculated, a similarity graph matrix is then computed, and a conventional clustering algorithm yields the final clusters:

S_c = {C_1, C_2, ..., C_l}

where l is the number of clusters, C_i is the i-th cluster center, and S_c is the cluster set.
For step S5 in the above embodiment, specifically: the clusters obtained in the above steps represent similarities and associations between software samples. Combining the embedded vector from step S3 and the clustering result from step S4, the software samples are realigned by the cluster voting prompt (CVP) method. Specifically, the cluster set S_c and the tag embeddings E_text are used as inputs of the cluster voting prompt method. Given the semantic clustering result S_c, a vocabulary voting distribution matrix M is calculated, where M represents the probability that E_text belongs to each cluster. The clustering result with the highest probability in the matrix M is taken as the classification voting label of the software sample text; this process can be expressed as the following formula:

T_c = CVP(E_text, S_c)

where T_c represents the classification voting label of the software sample text, CVP is the cluster voting prompt algorithm for realignment and classification, E_text is the multi-modal feature embedding vector of the software sample, and S_c represents the clustering result.
With the help of the realignment algorithm, the present embodiment can reorganize and align the embedded vectors according to these clustering results to better reflect the similarity between software samples. And finally, combining the probability label obtained in the step S2 with the classified voting label to generate a weak label of the software sample. These weak tags reflect the classification information of the software sample and can be used for further threat detection and analysis.
For step S6 in the above embodiment, specifically: a student encoder is built for processing the log information of the software; its task is to convert the log information into a feature representation for subsequent prediction and classification. An exponential moving average (EMA) method is introduced to update the weights of the student encoder. EMA is a smooth weight-update strategy that helps improve the stability and generalization performance of the model. Following the prediction step in step S2, a prediction label T_s is obtained from the student encoder.
For step S7 in the above embodiment, specifically: the weak label Tc and the prediction label Ts of the software sample need to be processed to calculate the sample loss. The aim is to calculate the loss from the weak label Tc and the model-output prediction label Ts of the software sample, combined with the real sample label, by maximizing the boundary, so as to help the model learn a more accurate classification decision boundary and improve the efficiency of malware threat detection. The loss is calculated by the following expression:
Where L(θ) represents the loss function of the model, N represents the total number of samples, the hyper-parameter Δ represents the minimum boundary interval used to control the degree of boundary maximization, Δc(b) represents the pseudo-label loss of sample b, i.e. the difference between the weak label and the real label, and Δs(b) represents the prediction-label loss of sample b, i.e. the difference between the model's output probability label and the real label.
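The exact loss formula is not reproduced in the text above, so the following is only one plausible hinge-style reading of the boundary-maximizing description; the form max(0, Δ + Δs(b) − Δc(b)) is an assumption.

```python
def max_margin_loss(weak_losses, pred_losses, delta=0.3):
    """Hinge-style boundary-maximizing loss (assumed form).

    weak_losses : per-sample pseudo-label losses Δc(b)
    pred_losses : per-sample prediction-label losses Δs(b)
    delta       : minimum boundary interval Δ (0.3 matches the value
                  given later in the training configuration)
    """
    n = len(weak_losses)
    # Penalize samples whose prediction loss fails to beat the
    # weak-label loss by at least the margin delta.
    return sum(max(0.0, delta + ds - dc)
               for dc, ds in zip(weak_losses, pred_losses)) / n
```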
For the above embodiment, in order to better use the lightweight multi-modal feature alignment model (Multimodal Feature Alignment Model, MFC) already constructed to detect malware threat behavior, this embodiment also proposes a threat detection model. It builds a malware threat detection framework on a traditional model, extracts the structural feature information of the malware, and iteratively processes unlabeled data through a graph neural network to obtain pseudo-supervised structural feature information. A cluster voting prompt realignment algorithm is then adopted, and the malware categories in the vocabulary are initially identified by an iterative graph clustering method. Meanwhile, malware text category prompts are generated using CLIP and the malware vocabulary, and the software-encoded graph and the software text category prompts are rearranged into structural alignment. Finally, a lightweight threat detection model is built based on a teacher-student learning strategy to effectively identify and detect malware threats, thereby achieving rapid and efficient threat detection.
To verify the performance of the model, based on the above embodiments, this embodiment tests the model on the Common Vulnerabilities and Exposures (CVE) library, the Aposemat IoT-23 labeled dataset of malicious and benign Internet-of-Things network traffic, and the ADFA intrusion detection dataset, in combination with emergency response handling methods disclosed on the network. The performance of the small-sample learning model in malware detection and defense, evaluated from the three aspects of accuracy, recall and F1 value, is shown in Table 1:
Table 1: performance comparison table of small sample learning model in malware detection and defense
According to the analysis of the actual results, for malware detection on the same datasets, the small-sample learning model based on contrastive learning achieves a larger improvement in malware detection and defense than the other model methods. In the horizontal comparison, different models are compared on the ADFA dataset: on top of basic malware network frameworks such as Long Short-Term Memory (LSTM), Bidirectional Long Short-Term Memory (BiLSTM) and Gated Recurrent Unit (GRU) models, a few-sample contrastive learning classifier is added to detect malware attack behavior, and the recognition accuracy, recall and F1 values are shown in Table 2:
Table 2: comparison table of accuracy rate, recall rate and F1 value of small sample learning model
It can be seen that the best performance improves accuracy, recall and F1 by 8.82%, 5.70% and 7.70% respectively, a clear improvement over the industry average. In different malware detection scenarios, the required sample size of malware attack behaviors is also greatly reduced. Therefore, the lightweight model with multi-modal feature alignment proposed in this embodiment can detect malware even when few malware samples are available, and the analysis shows that, combined with the introduced teacher-student strategy method, it can effectively detect malware threats.
In this embodiment, the whole pipeline framework shown in fig. 1 needs to be trained in advance, and the prediction modes of the training phase and the testing phase are the same, as follows:
Pre-training with Malware Training Sets (malware training set): the pre-training task is performed in a teacher-student strategy learning mode. The two branches predict the labels of the software simultaneously through the graph encoder and the self-encoder; the graph encoder outputs predicted features by aligning the two modalities of graph and text, and the threat type of the software is classified according to these features. Meanwhile, the graph encoder updates the self-encoder by exponential moving average, and the labels output by the self-encoder and the alignment labels output by the graph encoder are used to calculate the loss by maximizing the boundary, so that the parameters of the model are iteratively updated.
After pre-training is completed, the network model is fine-tuned for 15000 iterations with the open-source dataset Malware Training Sets (malware training set).
In this embodiment, the network model is initialized with random parameters, the maximized boundary is used for the final loss calculation, and the AdamW optimizer is used with default momentum settings β1 = 0.9 and β2 = 0.999; dropout is set to 0.1.
The maximum length of the input log sequence is 256 and the training batch size is set to 16. The learning rate of the self-encoder is held until training reaches 5000 iterations and then begins to descend, with training continuing to 10000 iterations; the L2 decay parameter is 0.01, and the parameters of the backbone network are fixed at this point and do not participate in training. In the prediction phase, only the self-encoder branch is used to classify the software; the number of nodes in the self-encoder is set to 128, the weight decay is set to 0.015, the minimum boundary interval Δ is 0.3, and the same configuration is adopted in the training and inference phases.
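The learning-rate behavior described above might be sketched as follows; the linear decay shape and the base_lr value are assumptions, since the text only states that the rate is held and then descends (the base value itself is not reproduced).

```python
def lr_schedule(step, base_lr, hold_steps=5000, total_steps=10000):
    """Learning-rate schedule sketched from the description above.

    The rate is held constant for the first `hold_steps` self-encoder
    updates, then decays until `total_steps`. Linear decay is an
    assumption; `base_lr` must be supplied by the caller.
    """
    if step < hold_steps:
        return base_lr
    # Linearly decay from base_lr at hold_steps down to 0 at total_steps.
    frac = (total_steps - step) / (total_steps - hold_steps)
    return base_lr * max(0.0, frac)
```

In a PyTorch setup this shape could be handed to torch.optim.lr_scheduler.LambdaLR on top of the AdamW optimizer configured with the β and weight-decay values given above.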
It should be noted that, although the foregoing embodiments have been described herein, the scope of the present invention is not limited thereby. Therefore, based on the innovative concept of the present invention, alterations and modifications to the embodiments described herein, or equivalent structures or equivalent process transformations made using the contents of the description and drawings, which apply the above technical solution directly or indirectly to other related technical fields, are all included in the scope of protection of the present invention.

Claims (2)

1. The lightweight malicious software threat detection method based on multi-modal feature alignment is characterized by comprising the following steps:
S1, giving log information and a malicious software tag table of a program sample of software to be detected;
S2, analyzing the log information of the software to be detected, obtaining relevant fields through regularization to judge association relations, initializing the relevant fields and their association relations as nodes and edges respectively, inputting the nodes and edges into a graph encoder, updating the node embeddings in the graph encoder through message passing, and preliminarily outputting different probability labels of the software to be detected;
S3, introducing a malicious software tag to construct a vocabulary, and obtaining an embedded vector after the vocabulary passes through a CLIP encoder;
S4, performing spectral clustering on the nodes and edges obtained in step S2, and dividing the nodes in the graph into a plurality of clusters, so that nodes in the same cluster have high similarity and nodes in different clusters have low similarity;
S5, obtaining a classified voting label of the software to be detected through a cluster voting prompt realignment algorithm by using the embedded vector obtained in the step S3 and the clustering result obtained in the step S4, and forming a weak label by using the classified voting label and the probability label obtained in the step S2;
S6, establishing a student encoder, updating the student encoder from the graph encoder by using an exponential moving average, inputting the log information of the software to be detected into the graph encoder, and predicting with the updated weights of the nodes and edges to obtain a prediction label of the software to be detected;
S7, calculating the loss of the sample by using the real sample label and the boundary-maximization method for the weak label obtained in step S5 and the prediction label obtained in step S6;
S8, obtaining a decision result according to the loss of the sample, judging according to the decision result whether the execution program of the software to be detected constitutes a malicious software threat behavior, and adding the execution program into the training set for the next round of detection of other software;
Step S2 further includes:
s201, transforming an original data field by adopting a regularization method, regularizing the field X into an expression of X' within the same scale range, wherein the expression is as follows:
Wherein X represents an original data field, and X' represents a regularized data field;
S202, selecting relevant fields from the regularized data fields as the initialization of nodes and edges, wherein the nodes represent different data fields and the edges represent the association relations among the fields; the initialization of nodes and edges is used for constructing a graph in which the nodes represent different data attributes and the edges represent the association relations among the attributes; for each field Fi in the log data, the value of the field is taken as a node, each field Fi having a regularized value X'i, and the expression for node initialization is:
V={v1,v2,...,vk}
Where V represents the set of nodes, each node vi corresponding to a field fi, and k represents the number of fields;
S203, initializing the edges to represent the association relations between fields, wherein an edge exists between each pair of nodes in a fully connected graph, the weight of each edge is initialized to a default value and is represented as an adjacency matrix A, where Aij represents the weight of the edge between node vi and node vj;
S204, updating the embedded representation of the nodes by using the graph encoder, wherein, for the obtained node set V and adjacency matrix A, each node vi has an embedding vector hi, initially set to the node's initialized value, and the update expression of the graph encoder is:
hi(l+1) = σ( Σ_{vj∈N(vi)} (1/cij) · W(l) · hj(l) )
Wherein hi(l) represents the embedding of node vi at layer l, σ represents the activation function, N(vi) represents the set of neighbor nodes of node vi, cij represents a normalization constant, typically the sum of the weights of the edges between node vi and its neighbor nodes vj, and W(l) represents the weight matrix of layer l used for the linear transformation;
S205, the node feature representation hi obtained by the graph encoder is used for prediction, and an additional fully connected layer maps the node features to class probabilities; the expression of this process is:
Zi = W · hi + b
Wherein Zi represents the classification score or probability of node vi, W represents a weight matrix, and b represents a bias vector;
S206, converting the classification scores of the node into a probability distribution by using a normalization (softmax) function, wherein the expression is as follows:
P(i|z) = exp(zi) / Σj exp(zj)
label(vi) = argmax P(i|z)
Wherein P(i|z) represents the probability that the given node vi belongs to class i, and label(vi) is the label of node vi;
Step S3 further includes:
S301, introducing a malicious software tag for describing different types of malicious software features and behaviors, and constructing the malicious software tag into a malicious software tag vocabulary with multi-modal features;
S302, introducing a CLIP encoder, and processing a malicious software tag vocabulary through the CLIP encoder to obtain an embedded vector associated with the tag, wherein the expression is as follows:
Etext=CE(t)
Wherein E text represents the embedded vector associated with the tag, t represents the malware tag, CE represents the CLIP encoder;
step S4 further includes: and calculating the similarity among the nodes, then calculating a similarity graph matrix, and obtaining clusters by adopting a clustering algorithm, wherein the expression is as follows:
Wherein, l represents the number of clusters, Ci is the i-th cluster center, and Sc is the cluster set;
step S5 further includes:
s501, combining the embedded vector obtained in the step S3 with the clustering result obtained in the step S4, realigning a software sample through a cluster voting prompt, and obtaining a classification voting label of the software, wherein the expression is as follows:
Tc=CVP(Etext,Sc)
Wherein, Tc represents the classification voting label of the software sample text, CVP represents the cluster voting prompt algorithm for realignment and classification, Etext represents the multi-modal feature embedding vector of the software sample, and Sc represents the clustering result;
S502, combining the probability label and the classified voting label in the step S2 to generate a weak label of a software sample;
Step S6 further includes: establishing a student encoder for converting the log information of the software to be detected into a feature representation, introducing an exponential moving average method to update the weights of the student encoder, and obtaining the prediction label Ts of the self-encoder according to the prediction step in step S2;
In step S7, for the weak label of the software sample obtained in step S5 and the prediction label obtained in step S6, the loss of the sample is calculated by using the real sample label and the maximum boundary method, and the expression is:
Where L(θ) represents the loss function of the model, N represents the total number of samples, Δ is a hyper-parameter representing the minimum boundary interval used for training the model, Δc(b) represents the weak-label loss of sample b, i.e. the difference between the weak label and the real label, and Δs(b) represents the prediction-label loss of sample b, i.e. the difference between the model's output probability label and the real label.
2. The multi-modal feature alignment-based lightweight malware threat detection method of claim 1, wherein the log information in step S1 comprises: date, timestamp, IP address, file path, user operation, port, and event type.
CN202410086383.2A 2024-01-22 2024-01-22 Multi-mode feature alignment-based lightweight malicious software threat detection method Active CN117610002B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410086383.2A CN117610002B (en) 2024-01-22 2024-01-22 Multi-mode feature alignment-based lightweight malicious software threat detection method


Publications (2)

Publication Number Publication Date
CN117610002A CN117610002A (en) 2024-02-27
CN117610002B true CN117610002B (en) 2024-04-30

Family

ID=89956493


Country Status (1)

Country Link
CN (1) CN117610002B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113779581A (en) * 2021-09-15 2021-12-10 山东省计算中心(国家超级计算济南中心) Robust detection method and system for lightweight high-precision malicious software identification model
CN113821799A (en) * 2021-09-07 2021-12-21 南京邮电大学 Multi-label classification method for malicious software based on graph convolution neural network
WO2022162379A1 (en) * 2021-01-29 2022-08-04 Glasswall (Ip) Limited Machine learning methods and systems for determining file risk using content disarm and reconstruction analysis
CN115129896A (en) * 2022-08-23 2022-09-30 南京众智维信息科技有限公司 Network security emergency response knowledge graph relation extraction method based on comparison learning
CN115375781A (en) * 2022-07-20 2022-11-22 华为技术有限公司 Data processing method and device
CN116541838A (en) * 2023-04-19 2023-08-04 杭州电子科技大学 Malware detection method based on contrast learning
CN116610962A (en) * 2023-04-25 2023-08-18 上海任意门科技有限公司 Content auditing method and device, electronic equipment and storage medium
CN117094000A (en) * 2023-08-08 2023-11-21 合肥工业大学 Multi-modal migration anti-attack method oriented to vision-language pre-training model
CN117216741A (en) * 2023-07-28 2023-12-12 武汉盛信鸿通科技有限公司 Multimode sample implantation method based on contrast learning system
CN117235742A (en) * 2023-11-13 2023-12-15 中国人民解放军国防科技大学 Intelligent penetration test method and system based on deep reinforcement learning

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11714905B2 (en) * 2019-05-10 2023-08-01 Sophos Limited Attribute relevance tagging in malware recognition
US20220129556A1 (en) * 2020-10-28 2022-04-28 Facebook, Inc. Systems and Methods for Implementing Smart Assistant Systems


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
"Learning Privacy-Preserving Student Networks via Discriminative-Generative Distillation"; S. Ge et al.; IEEE Transactions on Image Processing; 2022-12-07; vol. 32; 116-127 *
"Video object segmentation fusing visual words and a self-attention mechanism" (in Chinese); Chen Yadang et al.; Journal of Image and Graphics; 2022-08-12; 2444-2457 *
"Research on the construction of and defense against adversarial examples in deep learning" (in Chinese); Duan Guanghan; Ma Chunguang; Song Lei; Wu Peng; Chinese Journal of Network and Information Security; 2020-03-23 (No. 02); 1-11 *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant