CN112115465B - Method and system for detecting typical attack behavior of malicious code - Google Patents

Publication number
CN112115465B
Authority
CN
China
Prior art keywords
api
ontology knowledge
sequence
string
ontology
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010826647.5A
Other languages
Chinese (zh)
Other versions
CN112115465A (en
Inventor
薛静锋
韩伟杰
王勇
张继
单纯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Institute of Technology BIT
Peoples Liberation Army Strategic Support Force Aerospace Engineering University
Original Assignee
Beijing Institute of Technology BIT
Peoples Liberation Army Strategic Support Force Aerospace Engineering University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Institute of Technology BIT and Peoples Liberation Army Strategic Support Force Aerospace Engineering University
Priority to CN202010826647.5A
Publication of CN112115465A
Application granted
Publication of CN112115465B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/52 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems during program execution, e.g. stack integrity; Preventing unwanted data erasure; Buffer overflow
    • G06F21/53 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems during program execution, e.g. stack integrity; Preventing unwanted data erasure; Buffer overflow by executing in a restricted environment, e.g. sandbox or secure virtual machine
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/56 Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/566 Dynamic detection, i.e. detection performed at run-time, e.g. emulation, suspicious activities

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Virology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method and system for detecting typical attack behaviors of malicious code. It belongs to the technical field of network security and enables comprehensive characterization of the attack process of typical malicious behaviors of malicious code. The technical scheme is as follows. Malicious code is run in a sandbox environment, and a dynamic system call API sequence and an original ontology knowledge sequence are extracted from the generated dynamic analysis report. A classification contribution degree is calculated for each API, and the APIs are sorted in descending order of classification contribution degree to obtain a maliciousness-ranked sequence. APIs are selected in turn from this sequence as search starting points. For each search starting point, the position A of the starting point in the original ontology knowledge sequence is located, a forward traversal search and a backward traversal search are performed from position A, and the ontology knowledge tuples of the APIs that belong to the same behavior type as the search starting point are extracted to form an ontology knowledge string. The typical attack behavior of the malicious code represented by the ontology knowledge string is taken as the detection result.

Description

Method and system for detecting typical attack behavior of malicious code
Technical Field
The invention relates to the technical field of network security, in particular to a method and a system for detecting typical attack behaviors of malicious codes.
Background
In today's cyberspace, malicious code has become the main weapon attackers rely on to launch network attacks. Its attack mechanisms are increasingly complex and its destructive capability increasingly powerful, making it the main threat to cyberspace. To address this challenge, researchers mainly use machine learning for automated analysis and detection: they extract features of malicious code by static, dynamic, or hybrid analysis, and then train a classifier to detect and classify samples automatically.
Current research on malicious code focuses mainly on accurate detection, that is, on extracting features and deciding whether a program is malicious. Although this allows malicious code to be detected effectively, it falls well short of providing a thorough understanding of the malicious code.
Because current research only outputs a verdict on whether a program is malicious and does not analyze the attack process of its typical malicious behaviors, the typical attack behaviors of malicious code are neither mined nor understood. It is therefore difficult to understand malicious code comprehensively or to formulate targeted protective measures.
Disclosure of Invention
In view of this, the invention provides a method and a system for detecting typical attack behaviors of malicious codes, which can realize comprehensive characterization of the attack process of typical malicious behaviors of malicious codes and realize comprehensive mining and cognition of typical attack behaviors of malicious codes by constructing ontology knowledge strings for characterizing typical attack behaviors of malicious codes.
In order to achieve the purpose, the technical scheme of the invention comprises the following steps:
S1. Run the malicious code in a sandbox environment, and extract a dynamic system call API sequence and an original ontology knowledge sequence from the generated dynamic analysis report.
S2. Calculate a classification contribution degree for each API in the dynamic system call API sequence, and sort the APIs in descending order of classification contribution degree, that is, by maliciousness, to obtain a maliciousness-ranked sequence (formula image GDA0002732121450000021).
S3. Select APIs in turn from the maliciousness-ranked sequence (formula image GDA0002732121450000022) as search starting points.
S4. Locate the position A of the search starting point in the original ontology knowledge sequence, perform a forward traversal search and a backward traversal search from position A in the original ontology knowledge sequence, and extract the ontology knowledge tuples of the APIs that belong to the same behavior type as the search starting point to form an ontology knowledge string.
S5. Take the typical attack behavior of the malicious code represented by the ontology knowledge string as the detection result.
Further, the dynamic system call API sequence comprises all APIs called by the system in the running process of the malicious code; the original ontology knowledge sequence consists of ontology knowledge tuples corresponding to each API; each ontology tuple contains the API and its operands.
Further, the forward traversal search is specifically as follows: take the API at the position preceding position A. If its behavior type is the same as that of the API at the search starting point, add its ontology knowledge tuple to the forward ontology knowledge substring, update position A to the preceding position, and repeat the forward traversal search until the behavior type of the API at the preceding position differs from that of the API at the search starting point.
The backward traversal search is specifically as follows: take the API at the position following position A. If its behavior type is the same as that of the API at the search starting point, add its ontology knowledge tuple to the backward ontology knowledge substring, update position A to the following position, and repeat the backward traversal search until the behavior type of the API at the following position differs from that of the API at the search starting point.
After the forward traversal search and the backward traversal search are finished, the forward ontology knowledge substring and the backward ontology knowledge substring are combined into an ontology knowledge string.
Further, in S3 the API ranked i-th by maliciousness is taken in turn as the search starting point, with i initially 1. In S4 a check on the number of ontology knowledge tuples in the ontology knowledge string is added: if that number is smaller than a set threshold, i is increased by 1 and the method returns to S3. The threshold is set empirically.
Further, the behavior types of the API mainly include file operations, system operations, process/thread operations, registry operations, storage operations, kernel operations, network operations, device operations, window operations, and text operations.
Another embodiment of the invention provides a system for detecting typical attack behaviors of malicious code, comprising a data acquisition module, a data preprocessing module, an ontology knowledge string extraction module, and a behavior detection module.
The data acquisition module is used to run the malicious code in a sandbox environment, extract a dynamic system call API sequence and an original ontology knowledge sequence from the generated dynamic analysis report, and send them to the data preprocessing module.
The data preprocessing module is used to calculate a classification contribution degree for each API in the dynamic system call API sequence, sort the APIs in descending order of classification contribution degree, that is, by maliciousness, to obtain a maliciousness-ranked sequence (formula image GDA0002732121450000031), and send it to the ontology knowledge string extraction module.
The ontology knowledge string extraction module is used to select APIs in turn from the maliciousness-ranked sequence (formula image GDA0002732121450000032) as search starting points; to locate the position A of the search starting point in the original ontology knowledge sequence, perform a forward traversal search and a backward traversal search from position A in the original ontology knowledge sequence, and extract the ontology knowledge tuples of the APIs that belong to the same behavior type as the search starting point to form an ontology knowledge string; and to send the ontology knowledge string to the behavior detection module.
The behavior detection module is used to take the typical attack behavior of the malicious code represented by the ontology knowledge string as the detection result.
Further, the dynamic system call API sequence comprises all APIs called by the system in the running process of the malicious code; the original ontology knowledge sequence consists of ontology knowledge tuples corresponding to each API; each ontology tuple contains the API and its operands.
Further, the forward traversal search is specifically as follows: take the API at the position preceding position A. If its behavior type is the same as that of the API at the search starting point, add its ontology knowledge tuple to the forward ontology knowledge substring, update position A to the preceding position, and repeat the forward traversal search until the behavior type of the API at the preceding position differs from that of the API at the search starting point.
The backward traversal search is specifically as follows: take the API at the position following position A. If its behavior type is the same as that of the API at the search starting point, add its ontology knowledge tuple to the backward ontology knowledge substring, update position A to the following position, and repeat the backward traversal search until the behavior type of the API at the following position differs from that of the API at the search starting point.
After the forward traversal search and the backward traversal search are finished, the forward ontology knowledge substring and the backward ontology knowledge substring are combined into an ontology knowledge string.
Further, in the ontology knowledge string extraction module, the API ranked i-th by maliciousness is taken in turn as the search starting point, with i initially 1. The module also checks the number of ontology knowledge tuples in the ontology knowledge string: if that number is less than a set threshold, i is increased by 1, the search starting point is updated, a new ontology knowledge string is obtained, and the new ontology knowledge string is sent to the behavior detection module. The threshold is set empirically.
Further, the behavior types of the API mainly include file operations, system operations, process/thread operations, registry operations, storage operations, kernel operations, network operations, device operations, window operations, and text operations.
Advantageous effects:
The method and system for detecting typical attack behaviors of malicious code provided by the invention start from the observation that dynamic system call information effectively characterizes the behavior of malicious code. The malicious code is analyzed dynamically, its dynamic system call API sequence is extracted, the classification contribution degree of each API is calculated, and the APIs are sorted accordingly. In addition, because ontology knowledge can effectively describe the process of program behavior, an ontology model is introduced to construct a knowledge representation framework for malicious code. On this basis, the ontology knowledge sequence of the malicious code is traversed using the classification contribution degree of the dynamic system calls and the behavior type information of the APIs, and a meaningful ontology knowledge string is extracted from the original ontology knowledge sequence. The extracted ontology knowledge string effectively reflects how the typical malicious behaviors of the malicious code are carried out, establishes an ontology knowledge representation framework for those behaviors, and supports a systematic understanding of the typical attack behaviors of malicious code.
Drawings
Fig. 1 is a flowchart of a typical attack behavior detection method for malicious code according to an embodiment of the present invention;
Fig. 2 shows the process of generating an ontology knowledge sequence based on the association between APIs and ontology knowledge, according to an embodiment of the present invention;
Fig. 3 is a block diagram of a typical attack behavior detection system for malicious code according to another embodiment of the present invention.
Detailed Description
The invention is described in detail below by way of example with reference to the accompanying drawings.
The invention provides a typical attack behavior detection method for malicious codes, and the flow of the typical attack behavior detection method is shown in figure 1.
The principle of the invention is as follows:
API information effectively characterizes the behavior of a program and is therefore widely used for that purpose. Research has also found that calculating the classification contribution degree of each API identifies the APIs that are most clearly malicious: the higher an API's classification contribution degree, the more clearly it expresses maliciousness. Research has further found that a malicious program usually executes system calls of the same type consecutively while carrying out a malicious operation. In other words, such runs of consecutive system calls are often the concrete manifestation of a typical malicious operation.
Therefore, the invention first calculates the classification contribution degree of each system call API, selects the APIs with higher classification contribution degree, and then performs a traversal search over the original ontology knowledge sequence based on the behavior type of the selected API. The behavior types of the API mainly include file operation, system operation, process/thread operation, registry operation, storage operation, kernel operation, network operation, device operation, window operation, text operation, and the like. According to the behavior type of the API, the ontology knowledge substrings belonging to the same behavior type are selected from the ontology knowledge sequence, and the extracted ontology knowledge substrings can represent a complete operation process of a typical attack behavior of the malicious code.
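The behavior-type lookup described above can be sketched as a small table. This is only an illustration: the enum `BehaviorType`, the table `API_BEHAVIOR`, and the helper `behavior_type` are hypothetical names, and the table is a partial, assumed mapping. Only the file-operation entries for SetFilePointer, GetFileType, NtCreateFile and NtWriteFile follow the example given later in this description; the process/thread and registry entries are assumptions added for contrast.

```python
from enum import Enum


class BehaviorType(Enum):
    """The ten behavior types listed in the description."""
    FILE = "file"
    SYSTEM = "system"
    PROCESS_THREAD = "process/thread"
    REGISTRY = "registry"
    STORAGE = "storage"
    KERNEL = "kernel"
    NETWORK = "network"
    DEVICE = "device"
    WINDOW = "window"
    TEXT = "text"


# Hypothetical, partial mapping for illustration only.
API_BEHAVIOR = {
    "SetFilePointer": BehaviorType.FILE,
    "GetFileType": BehaviorType.FILE,
    "NtCreateFile": BehaviorType.FILE,
    "NtWriteFile": BehaviorType.FILE,
    "CreateProcessInternalW": BehaviorType.PROCESS_THREAD,  # assumed entry
    "RegOpenKeyExW": BehaviorType.REGISTRY,                 # assumed entry
}


def behavior_type(api_name):
    """Return the behavior type of an API, or None if it is not in the table."""
    return API_BEHAVIOR.get(api_name)
```

A real deployment would need a complete table covering every API that the sandbox can report; how that table is built is not specified in the source.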
As shown in fig. 1, the method comprises the following steps:
S1. Run the malicious code in a sandbox environment, and extract a dynamic system call API sequence and an original ontology knowledge sequence from the generated dynamic analysis report.
The dynamic system call API sequence includes all APIs that the system calls during the execution of the malicious code.
For example, the dynamic system call API sequence may be expressed as:
(formula image GDA0002732121450000061)
where api_i is the i-th API in the API sequence and n is the total number of APIs in the API sequence.
The original ontology knowledge sequence consists of the ontology knowledge tuples corresponding to each API; each ontology knowledge tuple contains the API and its operands (operation objects). That is, each tuple represents the operation information of one API in the API sequence of the malicious code. Formally, the tuple is written as:
Onto_i = [api_i, object_i]  (1 ≤ i ≤ n)
where api_i denotes the i-th API in the API sequence and object_i denotes the operation object of api_i. An ontology knowledge sequence corresponding to the dynamic system call API sequence can thus be established as follows:
(formula image GDA0002732121450000062)
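A minimal sketch of step S1's output, assuming the dynamic analysis report has already been parsed into a dictionary. The report layout (a `calls` list whose entries have `api` and `object` keys), the `Onto` tuple type, and the function `build_sequences` are all hypothetical; the source does not name a sandbox or a report format.

```python
from typing import NamedTuple


class Onto(NamedTuple):
    """One ontology knowledge tuple Onto_i = [api_i, object_i]."""
    api: str
    obj: str


def build_sequences(report):
    """Build the API sequence and the ontology knowledge sequence.

    `report` is a hypothetical, already-parsed report of the form
    {"calls": [{"api": ..., "object": ...}, ...]}, in call order.
    """
    api_seq = [call["api"] for call in report["calls"]]
    onto_seq = [Onto(call["api"], call.get("object", "")) for call in report["calls"]]
    return api_seq, onto_seq


# Made-up report for illustration; the operand "dropped.exe" is invented.
report = {
    "calls": [
        {"api": "NtCreateFile", "object": "dropped.exe"},
        {"api": "SetFilePointer", "object": "dropped.exe"},
        {"api": "NtWriteFile", "object": "dropped.exe"},
    ]
}
api_seq, onto_seq = build_sequences(report)
```

Both sequences keep the original call order, so position i in `api_seq` and position i in `onto_seq` refer to the same call.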
S2. Calculate a classification contribution degree for each API in the dynamic system call API sequence, and sort the APIs in descending order of classification contribution degree, that is, by maliciousness, to obtain a maliciousness-ranked sequence (formula images GDA0002732121450000071 and GDA0002732121450000072).
S3. Select APIs in turn from the maliciousness-ranked sequence (formula image GDA0002732121450000073) as search starting points.
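Steps S2 and S3 can be sketched as a descending sort. The contribution scores below are made-up numbers: the formula for the classification contribution degree is not reproduced in this text, so the sketch assumes the scores are supplied in a dictionary. The function name `rank_by_contribution` and the decision to rank each distinct API once are likewise assumptions.

```python
def rank_by_contribution(api_seq, contribution):
    """Return distinct APIs sorted by descending classification contribution.

    `contribution` maps API name -> score (assumed to be computed elsewhere).
    APIs without a score are treated as 0.0. Ties keep first-seen order
    because Python's sort is stable.
    """
    distinct = list(dict.fromkeys(api_seq))  # drop repeats, keep call order
    return sorted(distinct, key=lambda api: contribution.get(api, 0.0), reverse=True)


# Made-up scores for illustration only.
api_seq = ["GetFileType", "NtCreateFile", "SetFilePointer", "NtCreateFile"]
contribution = {"SetFilePointer": 0.9, "NtCreateFile": 0.4, "GetFileType": 0.1}
ranked = rank_by_contribution(api_seq, contribution)
```

The first element of `ranked` is the first search starting point (i = 1); later elements are tried only if an earlier one does not yield a usable ontology knowledge string.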
S4. Locate the position A of the search starting point in the original ontology knowledge sequence, perform a forward traversal search and a backward traversal search from position A in the original ontology knowledge sequence, and extract the ontology knowledge tuples of the APIs that belong to the same behavior type as the search starting point to form an ontology knowledge string.
As noted above, malicious code usually executes system calls of the same type consecutively while carrying out a malicious operation, so such runs of consecutive system calls are often the concrete manifestation of a typical malicious operation. To generate a meaningful ontology knowledge sequence, the ontology knowledge string is therefore built by extracting, from the originally generated ontology knowledge sequence, the ontology knowledge tuples of the APIs that belong to the same behavior type, using the classification contribution degree and the behavior type information of the APIs. The extracted ontology knowledge string accurately reflects the operation process of a typical malicious behavior of the program and establishes an ontology knowledge representation framework for the typical malicious behaviors of malicious code.
In the embodiment of the present invention, the forward traversal search is specifically as follows:
Take the API at the position preceding position A. If its behavior type is the same as that of the API at the search starting point, add its ontology knowledge tuple to the forward ontology knowledge substring, update position A to the preceding position, and repeat the forward traversal search until the behavior type of the API at the preceding position differs from that of the API at the search starting point.
The backward traversal search is specifically as follows:
Take the API at the position following position A. If its behavior type is the same as that of the API at the search starting point, add its ontology knowledge tuple to the backward ontology knowledge substring, update position A to the following position, and repeat the backward traversal search until the behavior type of the API at the following position differs from that of the API at the search starting point.
After the forward traversal search and the backward traversal search are finished, the forward ontology knowledge substring and the backward ontology knowledge substring are combined into an ontology knowledge string.
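The two traversals can be sketched as two loops that walk outward from position A. The function name `extract_string` and the `TYPES` table are hypothetical. Two details are assumptions, since the text does not state them: the tuple at position A itself is included in the result, and the combined string is returned in original call order.

```python
def extract_string(onto_seq, start, behavior_of):
    """Collect the run of same-behavior-type tuples around index `start`.

    onto_seq: list of (api, object) tuples in call order.
    start: index of position A (the search starting point).
    behavior_of: callable mapping an API name to its behavior type.
    """
    target = behavior_of(onto_seq[start][0])

    # Forward traversal: step to earlier positions while the type matches.
    forward = []
    pos = start - 1
    while pos >= 0 and behavior_of(onto_seq[pos][0]) == target:
        forward.append(onto_seq[pos])
        pos -= 1

    # Backward traversal: step to later positions while the type matches.
    backward = []
    pos = start + 1
    while pos < len(onto_seq) and behavior_of(onto_seq[pos][0]) == target:
        backward.append(onto_seq[pos])
        pos += 1

    # `forward` was collected nearest-first, so reverse it to restore call order.
    return forward[::-1] + [onto_seq[start]] + backward


# Hypothetical behavior-type table and made-up operands for illustration.
TYPES = {
    "RegOpenKeyExW": "registry",
    "NtCreateFile": "file",
    "GetFileType": "file",
    "SetFilePointer": "file",
    "NtWriteFile": "file",
    "CreateProcessInternalW": "process/thread",
}
seq = [
    ("RegOpenKeyExW", "some key"),
    ("NtCreateFile", "dropped.exe"),
    ("GetFileType", "dropped.exe"),
    ("SetFilePointer", "dropped.exe"),
    ("NtWriteFile", "dropped.exe"),
    ("CreateProcessInternalW", "dropped.exe"),
]
onto_string = extract_string(seq, 3, TYPES.get)
```

With SetFilePointer at index 3 as the starting point, the forward traversal stops at the registry call and the backward traversal stops at the process call, leaving the four file-operation tuples.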
S5. Take the typical attack behavior of the malicious code represented by the ontology knowledge string as the detection result. The ontology knowledge string extracted in S4 effectively represents how the typical attack behavior of the malicious code is carried out, and helps researchers build a systematic understanding of the typical attack behaviors of the malicious code.
In the embodiment of the present invention, if the ontology knowledge string extracted from a highly malicious API contains only a few ontology knowledge tuples, it may not be sufficient to detect a typical attack behavior of the malicious code. Therefore, in S3 the API ranked i-th by maliciousness is taken in turn as the search starting point, with i initially 1. In S4 a check on the number of ontology knowledge tuples in the ontology knowledge string is added: if that number is smaller than a set threshold, i is increased by 1 and the method returns to S3. The threshold is set empirically and may be a small value such as 3 or 4.
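The threshold check can be sketched as a loop over the ranked APIs. The helper names `same_type_run` and `select_string` are hypothetical. Two choices are assumptions not fixed by the text: when an API occurs several times in the sequence, its first occurrence is used as position A, and an empty result is returned if no candidate reaches the threshold.

```python
def same_type_run(onto_seq, start, behavior_of):
    """Return the contiguous run of same-behavior-type tuples around `start`."""
    target = behavior_of(onto_seq[start][0])
    lo = hi = start
    while lo > 0 and behavior_of(onto_seq[lo - 1][0]) == target:
        lo -= 1
    while hi + 1 < len(onto_seq) and behavior_of(onto_seq[hi + 1][0]) == target:
        hi += 1
    return onto_seq[lo:hi + 1]


def select_string(ranked_apis, onto_seq, behavior_of, threshold=3):
    """Try starting points in rank order until a string has >= threshold tuples."""
    names = [api for api, _ in onto_seq]
    for api in ranked_apis:                    # i = 1, 2, ...
        if api not in names:
            continue
        start = names.index(api)               # assumption: first occurrence
        candidate = same_type_run(onto_seq, start, behavior_of)
        if len(candidate) >= threshold:
            return api, candidate
    return None, []                            # assumption: nothing qualified


# Hypothetical behavior types and made-up operands for illustration.
TYPES = {"RegOpenKeyExW": "registry", "NtCreateFile": "file",
         "SetFilePointer": "file", "NtWriteFile": "file"}
onto_seq = [
    ("RegOpenKeyExW", "some key"),
    ("NtCreateFile", "dropped.exe"),
    ("SetFilePointer", "dropped.exe"),
    ("NtWriteFile", "dropped.exe"),
]
start_api, onto_string = select_string(
    ["RegOpenKeyExW", "SetFilePointer"], onto_seq, TYPES.get, threshold=3
)
```

Here the top-ranked API yields a string of only one tuple, so the loop moves on to the second-ranked API, whose string of three tuples meets the threshold.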
A specific example of generating an ontology string is shown in fig. 2. The ontology knowledge string represents the process by which malicious code generates a malicious executable file. The extraction process of this specific example is explained in detail as follows:
(1) Based on the classification contribution degree, SetFilePointer, which has a high classification contribution degree, is selected as the current analysis API; its behavior type is file operation. The ontology knowledge statement corresponding to SetFilePointer is located in the original ontology knowledge sequence, and the original ontology knowledge sequence is then traversed in the forward and backward directions;
(2) In the forward traversal, the behavior types of GetFileType and NtCreateFile are found to be consistent with that of SetFilePointer, so the ontology knowledge statements corresponding to these APIs are added to the forward ontology knowledge substring;
(3) In the backward traversal, the behavior types of NtAllocateVirtualMemory, ntTaaddFile, NtCreateFile, GetFileType and NtWriteFile are found to be consistent with that of SetFilePointer, so the ontology knowledge statements corresponding to these APIs are added to the backward ontology knowledge substring;
(4) The forward ontology knowledge substring and the backward ontology knowledge substring are combined to form a complete ontology knowledge representation string.
(5) Manual support is still needed during the analysis. In the ontology knowledge sequence of this sample, the file operations are immediately followed by process operations whose purpose is to execute the malicious file created during the file operations. The process operations and the file operations are therefore combined to form a complete malicious behavior process.
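Step (5) is described as a manual step. The sketch below only illustrates what appending the adjoining run of process operations to a file-operation string looks like once an analyst has decided the two belong together; it is not part of the described method. The function `following_run`, the `TYPES` table, and the operands are hypothetical.

```python
def following_run(onto_seq, end, behavior_of, follow_type):
    """Return the tuples right after index `end` whose behavior type is `follow_type`."""
    run = []
    pos = end + 1
    while pos < len(onto_seq) and behavior_of(onto_seq[pos][0]) == follow_type:
        run.append(onto_seq[pos])
        pos += 1
    return run


# Hypothetical behavior types and made-up operands for illustration.
TYPES = {
    "NtCreateFile": "file",
    "SetFilePointer": "file",
    "NtWriteFile": "file",
    "CreateProcessInternalW": "process/thread",
    "NtResumeThread": "process/thread",
    "RegSetValueExW": "registry",
}
seq = [
    ("NtCreateFile", "dropped.exe"),
    ("SetFilePointer", "dropped.exe"),
    ("NtWriteFile", "dropped.exe"),
    ("CreateProcessInternalW", "dropped.exe"),
    ("NtResumeThread", "dropped.exe"),
    ("RegSetValueExW", "some key"),
]

file_string = seq[0:3]            # file-operation string ending at index 2
process_part = following_run(seq, 2, TYPES.get, "process/thread")
merged = file_string + process_part
```

The merged string then reads as one behavior: the file is created and written, and a process is started from it.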
Another embodiment of the present invention further provides a typical attack behavior detection system for malicious code, which is shown in Fig. 3 and includes a data acquisition module, a data preprocessing module, an ontology knowledge string extraction module, and a behavior detection module.
The data acquisition module is used to run the malicious code in a sandbox environment, extract a dynamic system call API sequence and an original ontology knowledge sequence from the generated dynamic analysis report, and send them to the data preprocessing module.
The data preprocessing module is used to calculate a classification contribution degree for each API in the dynamic system call API sequence, sort the APIs in descending order of classification contribution degree, that is, by maliciousness, to obtain a maliciousness-ranked sequence (formula image GDA0002732121450000091), and send it to the ontology knowledge string extraction module.
The ontology knowledge string extraction module is used to select APIs in turn from the maliciousness-ranked sequence (formula image GDA0002732121450000092) as search starting points; to locate the position A of the search starting point in the original ontology knowledge sequence, perform a forward traversal search and a backward traversal search from position A in the original ontology knowledge sequence, and extract the ontology knowledge tuples of the APIs that belong to the same behavior type as the search starting point to form an ontology knowledge string; and to send the ontology knowledge string to the behavior detection module.
The behavior detection module is used to take the typical attack behavior of the malicious code represented by the ontology knowledge string as the detection result.
In the embodiment of the invention, the dynamic system call API sequence comprises all APIs called by the system in the running process of the malicious code; the original ontology knowledge sequence consists of ontology knowledge tuples corresponding to each API; each ontology tuple contains the API and its operands.
In the embodiment of the present invention, the forward traversal search specifically includes: and taking the API corresponding to the previous position of the position A, if the behavior type of the API corresponding to the previous position is consistent with the behavior type of the API corresponding to the search starting point, adding the ontology knowledge tuple of the API corresponding to the previous position into the forward ontology knowledge sub-string, updating the position A to be the previous position, and repeatedly executing forward traversal search until the behavior type of the API corresponding to the previous position is inconsistent with the behavior type of the API corresponding to the search starting point.
The backward traversal search specifically includes: and taking the API corresponding to the next position of the position A, if the behavior type of the API corresponding to the next position is consistent with the behavior type of the API corresponding to the search starting point, adding the ontology knowledge tuple of the API corresponding to the next position into the backward ontology knowledge sub-string, updating the position A to be the next position, and repeatedly executing backward traversal search until the behavior type of the API corresponding to the next position is inconsistent with the behavior type of the API corresponding to the search starting point.
And after the forward traversal search and the backward traversal search are finished, combining the obtained forward ontology knowledge substring and the obtained backward ontology knowledge substring into an ontology knowledge string.
In the embodiment of the invention, the ontology knowledge string extraction module takes the API at the i-th position of the maliciousness ranking sequence as the search starting point in turn, where the initial value of i is 1. The module additionally checks the number of ontology knowledge tuples in the ontology knowledge string: if that number is less than a set threshold, i is incremented by 1, the search starting point is updated, and a new ontology knowledge string is obtained and sent to the behavior detection module. The threshold is set empirically.
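The threshold check amounts to a simple retry loop over the ranking. The sketch below is illustrative only: `select_knowledge_string` and the `extract` callable are hypothetical names, and returning `None` when the ranking is exhausted is an assumption, since the text does not say what happens in that case.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# One ontology knowledge tuple: (API name, operands).
KnowledgeTuple = Tuple[str, Tuple[str, ...]]


def select_knowledge_string(
    ranking: Sequence[str],
    extract: Callable[[str], List[KnowledgeTuple]],
    threshold: int,
) -> Optional[List[KnowledgeTuple]]:
    """Try ranked APIs in order until a long enough knowledge string is found."""
    i = 1  # 1-based index into the maliciousness ranking, as in the text
    while i <= len(ranking):
        candidate = extract(ranking[i - 1])
        if len(candidate) >= threshold:
            return candidate
        i += 1  # string too short: move to the next-ranked API
    # Assumption: no candidate reached the threshold.
    return None


# Hypothetical pre-computed knowledge strings, for demonstration only.
strings = {
    "A": [("A", ())],
    "B": [("X", ()), ("B", ()), ("Y", ())],
    "C": [("C", ()), ("Z", ())],
}
chosen = select_knowledge_string(["A", "B", "C"], lambda api: strings[api], threshold=3)
```

Here the top-ranked API "A" yields only one tuple, so the loop falls through to "B", whose string meets the threshold of 3.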
In the embodiment of the present invention, the behavior types of the API mainly include file operation, system operation, process/thread operation, registry operation, storage operation, kernel operation, network operation, device operation, window operation, and text operation.
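One way to represent these ten behavior types is an enumeration plus a lookup table. The enumeration members follow the list above; the `API_BEHAVIOR` table and the `same_behavior` helper are hypothetical, and the four API-to-type assignments are illustrative guesses rather than the mapping used in the patent.

```python
from enum import Enum


class BehaviorType(Enum):
    FILE = "file operation"
    SYSTEM = "system operation"
    PROCESS_THREAD = "process/thread operation"
    REGISTRY = "registry operation"
    STORAGE = "storage operation"
    KERNEL = "kernel operation"
    NETWORK = "network operation"
    DEVICE = "device operation"
    WINDOW = "window operation"
    TEXT = "text operation"


# Hypothetical mapping for a few Windows APIs; a real system would need
# a complete, curated table.
API_BEHAVIOR = {
    "CreateFileW": BehaviorType.FILE,
    "RegSetValueExW": BehaviorType.REGISTRY,
    "CreateRemoteThread": BehaviorType.PROCESS_THREAD,
    "connect": BehaviorType.NETWORK,
}


def same_behavior(api_a: str, api_b: str) -> bool:
    """True only if both APIs are known and share one behavior type."""
    type_a = API_BEHAVIOR.get(api_a)
    return type_a is not None and type_a == API_BEHAVIOR.get(api_b)
```

Treating unknown APIs as never matching is a conservative design choice: it ends a traversal rather than merging unrelated calls into one ontology knowledge string.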
In summary, the above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A method for detecting typical attack behavior of malicious code, characterized by comprising the following steps:
S1, running the malicious code in a sandbox environment, and extracting a dynamic system call API sequence and an original ontology knowledge sequence from the generated dynamic analysis report;
S2, calculating a classification contribution degree for each API in the dynamic system call API sequence, and sorting the APIs by classification contribution degree from largest to smallest, i.e., sorting by maliciousness, to obtain a maliciousness ranking sequence;
S3, sequentially selecting APIs from the maliciousness ranking sequence as search starting points;
S4, finding a position A where the search starting point is located in the original ontology knowledge sequence, performing a forward traversal search and a backward traversal search starting from the position A in the original ontology knowledge sequence, and extracting the ontology knowledge tuples corresponding to the APIs that belong to the same behavior type as the search starting point to form an ontology knowledge string;
S5, taking the typical attack behavior of the malicious code represented by the ontology knowledge string as a detection result.
2. The method of claim 1, wherein the dynamic system call API sequence comprises all system APIs called while the malicious code runs;
the original ontology knowledge sequence consists of the ontology knowledge tuples corresponding to each API; each ontology knowledge tuple contains the API and its operands.
3. The method of claim 1, wherein the forward traversal search is specifically:
taking the API at the position preceding the position A; if its behavior type is consistent with the behavior type of the API at the search starting point, adding the ontology knowledge tuple of that API to a forward ontology knowledge substring, updating the position A to the preceding position, and repeating the forward traversal search until the behavior type of the API at the preceding position is inconsistent with the behavior type of the API at the search starting point;
the backward traversal search is specifically:
taking the API at the position following the position A; if its behavior type is consistent with the behavior type of the API at the search starting point, adding the ontology knowledge tuple of that API to a backward ontology knowledge substring, updating the position A to the following position, and repeating the backward traversal search until the behavior type of the API at the following position is inconsistent with the behavior type of the API at the search starting point;
after the forward traversal search and the backward traversal search are finished, combining the obtained forward ontology knowledge substring and backward ontology knowledge substring into an ontology knowledge string.
4. The method according to claim 1, 2 or 3, wherein in S3, the API at the i-th position of the maliciousness ranking sequence is taken as the search starting point in turn, and the initial value of i is 1;
in S4, a check on the number of ontology knowledge tuples in the ontology knowledge string is added: if the number of ontology knowledge tuples in the ontology knowledge string is smaller than a set threshold, i is incremented by 1 and the method returns to S3;
the set threshold is set empirically.
5. The method of claim 1, 2 or 3, wherein the types of behavior of the API primarily include file operations, system operations, process/thread operations, registry operations, storage operations, kernel operations, network operations, device operations, window operations, and text operations.
6. A system for detecting typical attack behavior of malicious code, characterized by comprising a data acquisition module, a data preprocessing module, an ontology knowledge string extraction module and a behavior detection module;
the data acquisition module is used for running the malicious code in a sandbox environment, extracting a dynamic system call API sequence and an original ontology knowledge sequence from the generated dynamic analysis report, and sending the dynamic system call API sequence and the original ontology knowledge sequence to the data preprocessing module;
the data preprocessing module is used for calculating a classification contribution degree for each API in the dynamic system call API sequence, sorting the APIs by classification contribution degree from largest to smallest, i.e., sorting by maliciousness, to obtain a maliciousness ranking sequence, and sending the maliciousness ranking sequence to the ontology knowledge string extraction module;
the ontology knowledge string extraction module is used for sequentially selecting APIs from the maliciousness ranking sequence as search starting points; finding a position A where the search starting point is located in the original ontology knowledge sequence, performing a forward traversal search and a backward traversal search starting from the position A in the original ontology knowledge sequence, extracting the ontology knowledge tuples corresponding to the APIs that belong to the same behavior type as the search starting point to form an ontology knowledge string, and sending the ontology knowledge string to the behavior detection module;
the behavior detection module is used for taking the typical attack behavior of the malicious code represented by the ontology knowledge string as a detection result.
7. The system of claim 6, wherein the dynamic system call API sequence comprises all system APIs called while the malicious code runs;
the original ontology knowledge sequence consists of ontology knowledge tuples corresponding to each API; each ontology tuple contains the API and its operands.
8. The system of claim 6, wherein the forward traversal search is specifically:
taking the API at the position preceding the position A; if its behavior type is consistent with the behavior type of the API at the search starting point, adding the ontology knowledge tuple of that API to a forward ontology knowledge substring, updating the position A to the preceding position, and repeating the forward traversal search until the behavior type of the API at the preceding position is inconsistent with the behavior type of the API at the search starting point;
the backward traversal search is specifically:
taking the API at the position following the position A; if its behavior type is consistent with the behavior type of the API at the search starting point, adding the ontology knowledge tuple of that API to a backward ontology knowledge substring, updating the position A to the following position, and repeating the backward traversal search until the behavior type of the API at the following position is inconsistent with the behavior type of the API at the search starting point;
after the forward traversal search and the backward traversal search are finished, combining the obtained forward ontology knowledge substring and backward ontology knowledge substring into an ontology knowledge string.
9. The system according to claim 6, 7 or 8, wherein in the ontology knowledge string extraction module, the API at the i-th position of the maliciousness ranking sequence is taken as the search starting point in turn, and the initial value of i is 1;
the ontology knowledge string extraction module additionally checks the number of ontology knowledge tuples in the ontology knowledge string: if the number of ontology knowledge tuples in the ontology knowledge string is smaller than a set threshold, i is incremented by 1, the search starting point is updated, and a new ontology knowledge string is obtained and sent to the behavior detection module;
the set threshold is set empirically.
10. The system of claim 6, 7 or 8, wherein the types of behavior of the API primarily include file operations, system operations, process/thread operations, registry operations, storage operations, kernel operations, network operations, device operations, window operations, and text operations.
CN202010826647.5A 2020-08-17 2020-08-17 Method and system for detecting typical attack behavior of malicious code Active CN112115465B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010826647.5A CN112115465B (en) 2020-08-17 2020-08-17 Method and system for detecting typical attack behavior of malicious code

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010826647.5A CN112115465B (en) 2020-08-17 2020-08-17 Method and system for detecting typical attack behavior of malicious code

Publications (2)

Publication Number Publication Date
CN112115465A CN112115465A (en) 2020-12-22
CN112115465B true CN112115465B (en) 2022-11-04

Family

ID=73804298

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010826647.5A Active CN112115465B (en) 2020-08-17 2020-08-17 Method and system for detecting typical attack behavior of malicious code

Country Status (1)

Country Link
CN (1) CN112115465B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109543410A (en) * 2018-11-20 2019-03-29 Beijing Institute of Technology Malicious code detection method based on semantic mapping association
CN110414234A (en) * 2019-06-28 2019-11-05 Qi An Xin Technology Group Inc. Malicious code family identification method and device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
ES2755780T3 (en) * 2011-09-16 2020-04-23 Veracode Inc Automated behavior and static analysis using an instrumented sandbox and machine learning classification for mobile security

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
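Malicious code family classification based on attribute data flow graphs; Yang Pin et al.; Journal of Information Security Research; 20200305 (No. 03); full text *
Android malware detection and classification based on software genes; Han Jin et al.; Application Research of Computers; 20180408 (No. 06); full text *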

Also Published As

Publication number Publication date
CN112115465A (en) 2020-12-22

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant