KR101726360B1 - Method and server for generating suffix tree, method and server for detecting malicious code with using suffix tree - Google Patents

Info

Publication number
KR101726360B1
Authority
KR
South Korea
Prior art keywords
function call
malicious code
call sequence
suffix tree
unit
Prior art date
Application number
KR1020150092118A
Other languages
Korean (ko)
Other versions
KR20170002115A (en)
Inventor
임을규
김태근
박준규
Original Assignee
한양대학교 산학협력단
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 한양대학교 산학협력단
Priority to KR1020150092118A
Publication of KR20170002115A
Application granted
Publication of KR101726360B1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/56 Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/562 Static detection
    • G06F21/563 Static detection by source code analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Virology (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The present invention relates to a method for generating a suffix tree, a suffix tree generation server, a malicious code detection method using the suffix tree, and a malicious code detection server using the suffix tree. According to the present invention, there is provided a malicious code detection method comprising: loading a suffix tree generated based on a malicious code sample file; receiving a function call sequence of a target sample file from a client; processing the received function call sequence; searching the suffix tree for a subsequence of the processed function call sequence; and determining whether the target sample file is malicious code based on the search result for the suffix tree. According to the present invention, PC-based or mobile-based malicious code can be detected.

Description

TECHNICAL FIELD [0001] The present invention relates to a method for generating a suffix tree and a suffix tree generation server, and to a malicious code detection method and a malicious code detection server using the suffix tree.

Dynamic analysis means executing a program and extracting and analyzing its behavior information in order to determine whether the program is malicious. Signature-based analysis means analysis using the characteristics of existing malicious code.

There are many ways to analyze a function call sequence. In many cases, the similarity between an arbitrary program and known malicious code is calculated, and the program is judged to be malicious code when the similarity exceeds a certain threshold value.

The algorithms used in the similarity comparison vary.

In most cases, analysis-time overhead is incurred because the full analysis is performed without using time-efficient algorithms.

The present invention aims at solving all of the above problems.

Another object of the present invention is to provide a technique that can be utilized in PC-based or mobile-based malware detection.

Another object of the present invention is to utilize it in malicious code analysis and classification.

In order to accomplish the objects of the present invention as described above and achieve the characteristic effects of the present invention described below, the characteristic structure of the present invention is as follows.

According to an embodiment, there is provided a method for generating a suffix tree, the method comprising: executing a malicious code sample file including malicious code and a normal sample file; extracting a first function call sequence for the malicious code sample file and a second function call sequence for the normal sample file; processing the extracted first function call sequence and second function call sequence; and generating a suffix tree using the processed first function call sequence and second function call sequence.

According to another embodiment, in the method of generating a suffix tree, the processing step includes processing the first function call sequence and the second function call sequence by merging successively repeated elements in the extracted first function call sequence and second function call sequence.

According to yet another embodiment, in the method of generating a suffix tree, the generating step comprises: generating a temporary suffix tree based on the processed first function call sequence and second function call sequence; and determining a final suffix tree by removing, from the temporary suffix tree, the nodes corresponding to subsequences of the second function call sequence, which is extracted from the normal sample file, and the edges connecting those nodes.

A malicious code detection method according to an embodiment includes: loading a suffix tree generated based on a malicious code sample file; receiving a function call sequence of a target sample file from a client; processing the received function call sequence; searching the suffix tree for a subsequence of the processed function call sequence; and determining whether the target sample file is malicious code based on the search result for the suffix tree.

In the malicious code detection method according to another embodiment, the suffix tree is a final suffix tree generated by processing a function call sequence extracted from a normal sample file and a function call sequence extracted from a malicious code sample file including malicious code, and then removing the nodes for the function call sequence extracted from the normal sample file from the temporary suffix tree derived from the processed function call sequences.

According to yet another embodiment, in the malicious code detection method, the processing may include processing the function call sequence by merging successively repeated elements in a function call sequence received from the client.

According to another embodiment, in the malicious code detection method, the searching step includes: extracting a subsequence by applying a sliding window to the processed function call sequence; and determining whether the extracted subsequence matches a sequence derived from the nodes and edges of the suffix tree.

In the malicious code detection method according to another embodiment, the step of extracting the subsequence may include extracting subsequences by overlapping sliding windows of a predetermined element unit over the function call sequence.

According to another embodiment, the malicious code detection method may further include transmitting to the client, when the target sample file is determined to be malicious code, information indicating that malicious code exists in the target sample file.

According to one embodiment, in the suffix tree generation server, the suffix tree generation server includes: a sample file execution unit that executes a malicious code sample file including a malicious code and a normal sample file; A function call sequence extracting unit for extracting a first function call sequence for the malicious code sample file and a second function call sequence for the normal sample file; A function call sequence processing unit for processing the extracted first function call sequence and the second function call sequence; And a suffix tree generation unit for generating a suffix tree using the processed first function call sequence and the second function call sequence.

According to another embodiment, in the suffix tree generation server, the function call sequence processing unit may process the first function call sequence and the second function call sequence by merging successively repeated elements in the extracted first function call sequence and second function call sequence.

According to another embodiment, in the suffix tree generation server, the suffix tree generation unit generates a temporary suffix tree based on the processed first function call sequence and second function call sequence, and determines the final suffix tree by removing, from the temporary suffix tree, the nodes corresponding to subsequences of the second function call sequence, which is extracted from the normal sample file, and the edges connecting those nodes.

According to one embodiment, the malicious code detection server comprises: a suffix tree loading unit for loading a suffix tree generated based on a malicious code sample file; A function call sequence receiving unit for receiving a function call sequence of a target sample file from a client; A function call sequence processing unit for processing the received function call sequence; A suffix tree search unit for searching a suffix tree for a subsequence of the processed function call sequence; And a malicious code determination unit for determining whether the target sample file is a malicious code based on a search result for the suffix tree.

In the malicious code detection server according to another embodiment, the suffix tree is a final suffix tree generated by processing a function call sequence extracted from a normal sample file and a function call sequence extracted from a malicious code sample file including malicious code, and then removing the nodes for the function call sequence extracted from the normal sample file from the temporary suffix tree derived from the processed function call sequences.

In a malicious code detection server according to another embodiment, the function call sequence processing unit may process the function call sequence by merging successively repeated elements in the function call sequence received from the client.

In the malicious code detection server according to another embodiment, the suffix tree searching unit may include a subsequence extracting unit for extracting a subsequence by applying a sliding window to the processed function calling sequence; And a subsequence matching unit for determining whether the extracted subsequence is matched with a sequence derived from nodes and edges of the suffix tree.

In the malicious code detection server according to another embodiment, the subsequence extracting unit may include extracting a subsequence by overlapping sliding windows according to a predetermined element unit in the function calling sequence.

In the malicious code detection server according to another embodiment, the malicious code detection server may further include a malicious code information transmission unit for transmitting to the client, when the target sample file is determined to be malicious code, information indicating that malicious code exists in the target sample file.

The present invention can be utilized for PC-based or mobile-based malware detection. In addition, the present invention can be utilized in malicious code analysis and classification.

FIG. 1A is a flowchart illustrating a method for generating a suffix tree according to an embodiment of the present invention.
FIG. 1B is a flowchart illustrating a method for generating a suffix tree according to an embodiment of the present invention.
FIG. 2A is a flowchart illustrating a malicious code detection method according to an embodiment of the present invention.
FIG. 2B is a flowchart illustrating a malicious code detection method according to an embodiment of the present invention.
FIG. 3A is a conceptual diagram illustrating a malicious code detection method according to an embodiment of the present invention.
FIG. 3B is a flowchart illustrating a malicious code detection method according to an embodiment of the present invention, which is performed by a client and a server.
FIG. 4A illustrates the flow of an operation procedure of a client according to an embodiment of the present invention.
FIG. 4B illustrates the flow of an operation procedure of a server according to an embodiment of the present invention.
FIG. 5A shows a function call sequence before processing according to an embodiment of the present invention.
FIG. 5B shows function call sequence processing according to an embodiment of the present invention.
FIG. 5C shows a sliding window scan according to an embodiment of the present invention.
FIG. 6 illustrates a process of generating a suffix tree according to an embodiment of the present invention.
FIG. 7A shows a state before the removal of a function call sequence according to an embodiment of the present invention.
FIG. 7B illustrates a procedure for removing a function call sequence according to an embodiment of the present invention.
FIG. 7C illustrates the removal of a function call sequence according to an embodiment of the present invention.
FIG. 8 is a block diagram illustrating a suffix tree generation server according to an embodiment of the present invention.
FIG. 9 is a block diagram illustrating a malicious code detection server according to an embodiment of the present invention.
FIG. 10 is a block diagram illustrating a suffix tree search unit according to an embodiment of the present invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings.

FIG. 1A is a flowchart illustrating a method for generating a suffix tree according to an embodiment of the present invention.

Referring to FIG. 1A, a method of generating a suffix tree performed by a suffix tree generation server is shown. The method may include the following steps.

In step S110, the suffix tree creation server may execute a malicious code sample file including malicious code and a normal sample file. At this time, each of the malicious code sample file and the normal sample file may be plural, but is not limited thereto.

In step S120, the suffix tree generation server may extract a first function call sequence for the malicious code sample file and a second function call sequence for the normal sample file.

In step S130, the suffix tree creation server may process the extracted first function call sequence and the second function call sequence.

Specifically, the suffix tree generation server may process the first function call sequence and the second function call sequence by merging successively repeated elements in the extracted first function call sequence and the second function call sequence.
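The merging in step S130 can be sketched as follows. This is a minimal illustration that assumes a function call sequence is a list of integer identifiers; the function name `merge_repeats` is hypothetical and does not come from the patent text.

```python
from itertools import groupby


def merge_repeats(calls):
    """Collapse each run of identical consecutive elements into one element."""
    # groupby yields one (key, run) pair per run of equal neighbours,
    # so keeping only the keys keeps one element per run.
    return [key for key, _run in groupby(calls)]


raw = [10, 20, 20, 30, 40, 40, 40, 20]
print(merge_repeats(raw))  # [10, 20, 30, 40, 20]
```

Only consecutive repeats are merged: the final 20 is kept because it is separated from the earlier run of 20s.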

In step S140, the suffix tree generation server may generate the suffix tree using the processed first function call sequence and second function call sequence. Specifically, the suffix tree generation server may generate a temporary suffix tree based on the processed first function call sequence and second function call sequence. The suffix tree generation server may then determine the final suffix tree by removing, from the temporary suffix tree, the nodes corresponding to subsequences of the second function call sequence, which is extracted from the normal sample file, and the edges connecting those nodes.

FIG. 1B is a flowchart illustrating a method for generating a suffix tree according to an embodiment of the present invention.

Referring to FIG. 1B, a method for generating a suffix tree performed by the suffix tree generating server may include the following steps.

According to one embodiment, the suffix tree generation server may generate the suffix tree using the processed first function call sequence and the second function call sequence.

In step S141, the suffix tree generation server may generate a temporary suffix tree based on the processed first function call sequence and the second function call sequence.

In step S142, the suffix tree generation server can determine the final suffix tree by removing, from the temporary suffix tree, the nodes corresponding to subsequences of the second function call sequence, which is extracted from the normal sample file, and the edges connecting those nodes.

FIG. 2A is a flowchart illustrating a malicious code detection method according to an embodiment of the present invention.

Referring to FIG. 2A, the malicious code detection method performed by the malicious code detection server may include the following steps.

In step S210, the malicious code detection server may load the suffix tree generated based on the malicious code sample file. At this time, the suffix tree may be a final suffix tree generated by processing a function call sequence extracted from a normal sample file and a function call sequence extracted from a malicious code sample file including malicious code, and then removing the nodes for the function call sequence extracted from the normal sample file from the temporary suffix tree derived from the processed function call sequences.

In step S220, the malicious code detection server may receive a sequence of function calls of the target sample file from the client.

In step S230, the malicious code detection server may process the received function call sequence. Specifically, the malicious code detection server can process a function call sequence by merging successively repeated elements in the function call sequence received from the client.

In step S240, the malicious code detection server may search the suffix tree for a subsequence of the processed function call sequence. Specifically, the malicious code detection server can extract a subsequence by applying a sliding window to the processed function call sequence. At this time, the malicious code detection server can extract subsequences by moving overlapping sliding windows of a predetermined element unit over the function call sequence. In addition, the malicious code detection server may determine whether the extracted subsequence matches a sequence derived from the nodes and edges of the suffix tree. For example, suppose the preset element unit is 4 and the function call sequence is 10, 30, 32, 54, 23, 54, ... . The malicious code detection server can first extract 10, 30, 32, 54 as a subsequence, and next extract 30, 32, 54, 23 as a subsequence. Of course, the predetermined element unit may be increased or decreased depending on the case, and is not limited to this value.
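The window extraction in this example can be sketched as below. The function name `sliding_windows` and the `step` parameter are assumptions introduced for illustration; the text only specifies overlapping windows of a preset element unit.

```python
def sliding_windows(calls, size=4, step=1):
    """Return every window of `size` consecutive elements, advancing by `step`."""
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    # The last valid start index is len(calls) - size; shorter inputs yield no window.
    return [tuple(calls[i:i + size]) for i in range(0, len(calls) - size + 1, step)]


sequence = [10, 30, 32, 54, 23, 54]
for window in sliding_windows(sequence, size=4):
    print(window)
# (10, 30, 32, 54)
# (30, 32, 54, 23)
# (32, 54, 23, 54)
```

With a step of 1, neighbouring windows share all but one element, which matches the two subsequences listed in the example.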

In step S250, the malicious code detection server may determine whether the target sample file is malicious code based on the search result for the suffix tree.

In step S260, if the target sample file is determined to be malicious code, the malicious code detection server may transmit to the client information indicating that the malicious code exists in the target sample file.

FIG. 2B is a flowchart illustrating a malicious code detection method according to an embodiment of the present invention.

According to one embodiment, the malicious code detection server may retrieve a subsequence of the processed function call sequence from the suffix tree.

In step S241, the malicious code detection server can extract the subsequence by applying a sliding window to the processed function calling sequence.

In step S242, the malicious code detection server may determine whether the extracted subsequence matches a sequence derived from nodes and edges of the suffix tree.
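Steps S241 and S242 can be sketched together as follows. For brevity the sketch uses an uncompressed suffix trie built from nested dictionaries rather than a true suffix tree with multi-element edge labels; this is a named simplification that takes quadratic space, and the helper names are hypothetical.

```python
def build_suffix_trie(sequences):
    """Insert every suffix of every sequence into a nested-dict trie."""
    root = {}
    for seq in sequences:
        for start in range(len(seq)):
            node = root
            for symbol in seq[start:]:
                node = node.setdefault(symbol, {})
    return root


def contains(trie, subsequence):
    """True if `subsequence` can be read along a path starting at the root."""
    node = trie
    for symbol in subsequence:
        if symbol not in node:
            return False
        node = node[symbol]
    return True


def any_window_matches(trie, calls, size=4):
    """S241: slide a window over `calls`; S242: look each window up in the trie."""
    return any(contains(trie, calls[i:i + size])
               for i in range(len(calls) - size + 1))


signature = build_suffix_trie([[10, 30, 32, 54, 23]])   # illustrative values only
print(any_window_matches(signature, [1, 2, 30, 32, 54, 23, 9]))
print(any_window_matches(signature, [1, 2, 3, 4, 5]))
```

Because every suffix is inserted from the root, any contiguous subsequence of a stored sequence is a path from the root, so each lookup costs time proportional to the window length regardless of how many sequences were stored.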

FIG. 3A is a conceptual diagram illustrating a malicious code detection method according to an embodiment of the present invention.

Referring to FIG. 3A, the entire system includes a client 310 and a malicious code detection server 320.

According to one embodiment, the malicious code detection server 320 may generate a suffix tree to be used as a signature model. In addition, the generated suffix tree can be used to determine the maliciousness of an arbitrary program, and to detect malicious code in real time.

In addition, according to one embodiment, the client 310 may record the functions that are called during execution of an arbitrary program and transmit (311) the function call sequence, arranged in chronological order, to the malicious code detection server 320. In addition, the malicious code detection server 320 can search the suffix tree, which is a signature model generated in advance, for the transferred function call sequence. At this time, if the search finds a match with a function call sequence of malicious code, the program executed on the client can be determined to be malicious code. The malicious code detection server 320 may transmit the malicious code determination result to the client 310.

FIG. 3B is a flowchart illustrating a malicious code detection method according to an exemplary embodiment of the present invention, which is performed by a client and a malicious code detection server.

According to one embodiment, the malware detection framework may be configured with a server-client architecture. At this time, the server may be a malicious code detection server or a suffix tree generation server. In addition, the client may be, but is not limited to, a user terminal such as a smartphone or a tablet. The client can record (1-1) the functions that are called during execution of an arbitrary program. In addition, the client may periodically transmit (a) the sequence of function calls, arranged in chronological order, to the malicious code detection server. The malicious code detection server may load (2-1) a suffix tree, which is a signature model generated in advance, and search (2-2, 2-3, 2-4) the loaded suffix tree for the transmitted function call sequence. If the search finds a match with a function call sequence of malicious code, the program executed on the client can be determined to be malicious code. At this time, the malicious code detection server may transmit (b) the search result to the client.

FIG. 4A illustrates a flow of an operation procedure of a client according to an exemplary embodiment of the present invention.

Referring to FIG. 4A, the malicious code detection process of the client is shown. Because clients are more likely to be limited in performance than the malicious code detection server, the work they perform for detection analysis can be kept to a minimum.

According to one embodiment, the client may be responsible for recording the sequence of function calls and periodically sending it to the malicious code detection server.

The malicious code detection operation of the client may include the following steps.

In step S401, the client can execute an arbitrary program and record a function call in an arbitrary program.

In step S402, the client may periodically transmit the function call sequence to the malicious code detection server. Of course, sending a sequence of function calls to a malicious code detection server can happen repeatedly.
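The client's two steps can be sketched as below. The patent does not say how function calls are recorded, so this sketch stands in Python's `sys.setprofile` hook for whatever call-monitoring mechanism a real client would use, records function names rather than numeric identifiers, and approximates "periodically" by fixed-size batches. The names `record_calls` and `batches` are hypothetical, and the network transport is left out.

```python
import sys


def record_calls(program):
    """S401: run `program` and return the names of the Python functions it enters, in order."""
    calls = []

    def profiler(frame, event, arg):
        if event == "call":  # a Python-level function was entered
            calls.append(frame.f_code.co_name)

    sys.setprofile(profiler)
    try:
        program()
    finally:
        sys.setprofile(None)  # always stop recording, even if the program raises
    return calls


def batches(calls, batch_size):
    """S402: split the recorded sequence into chunks to send one after another."""
    for i in range(0, len(calls), batch_size):
        yield calls[i:i + batch_size]


def open_file():
    pass


def read_file():
    pass


def sample_program():
    open_file()
    read_file()
    read_file()


recorded = record_calls(sample_program)
for chunk in batches(recorded, batch_size=2):
    print(chunk)  # a real client would transmit each chunk to the server here
```

Keeping the client to recording and forwarding reflects the point above that analysis work is left to the server.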

FIG. 4B is a flowchart illustrating an operation of the malicious code detection server according to an embodiment of the present invention.

Referring to FIG. 4B, the malicious code detection process of the malicious code detection server is shown. The malicious code detection server can do more of the analysis work than the client. In addition, before analyzing the function call sequence transmitted from the client, the malicious code detection server may load into memory a suffix tree, which is a signature model representing the function call sequences of known malicious code. In addition, the malicious code detection server can process the function call sequence when it is transmitted from the client.

The malicious code detection operation of the malicious code detection server may include the following steps.

In step S411, the malicious code detection server may load the suffix tree as a signature model. Further, in step S412, the malicious code detection server may receive the function calling sequence from the client.

In step S413, the malicious code detection server may process the received function call sequence. Further, in step S414, the malicious code detection server can create a sliding window. Further, in step S415, the malicious code detection server can search the processed function calling sequence using the sliding window.

In step S416, the malicious code detection server may transmit the retrieved result to the client.
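The server-side flow from S413 to S416 can be sketched end to end as follows. To keep the orchestration readable, the loaded signature model is stood in for by a plain set of fixed-length tuples instead of a suffix tree; that substitution, the function name `handle_request`, and the shape of the returned result are all assumptions.

```python
from itertools import groupby


def handle_request(signatures, received_calls, window=4):
    """Process one received function call sequence and build the result for the client."""
    # S413: merge successively repeated elements.
    processed = [key for key, _run in groupby(received_calls)]

    # S414-S415: slide a fixed-length window and look each subsequence up.
    matches = []
    for i in range(len(processed) - window + 1):
        subsequence = tuple(processed[i:i + window])
        if subsequence in signatures:
            matches.append(subsequence)

    # S416: the result that would be transmitted back to the client.
    return {"malicious": bool(matches), "matches": matches}


# S411 stand-in: illustrative signature values, not taken from any real sample.
signatures = {(30, 32, 54, 23)}
print(handle_request(signatures, [10, 10, 30, 32, 32, 54, 23]))
print(handle_request(signatures, [1, 2, 3, 4, 5]))
```

The first call matches only after merging, since the raw sequence contains 32 twice in a row.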

FIG. 5A shows a function call sequence before processing according to an embodiment of the present invention.

Referring to FIG. 5A, a process of processing a function call sequence is shown.

According to one embodiment, when each element constituting the function call sequence 501 transmitted from the client appears continuously, the malicious code detection server may merge each element into one. The malware detection server can perform processing to reduce the length of the function call sequence. For example, if 20 and 20 are consecutive (510), the malicious code detection server may merge 20 into one. Also, if 40 and 40 are repeated (520), the malicious code detection server may merge 40 into one.

FIG. 5B shows function call sequence processing according to an embodiment of the present invention.

Referring to FIG. 5B, the function call sequence after processing is shown.

According to one embodiment, upon completion of function call sequence processing, the malicious code detection server may scan the processed function call sequence 502 with a fixed-length sliding window.

The malicious code detection server can also search the suffix tree for the sliding window subsequence being scanned.

FIG. 5C shows a sliding window scan according to an embodiment of the present invention.

Referring to FIG. 5C, a subsequence search using sliding window extraction is shown.

According to one embodiment, if the subsequence 530 extracted by applying the sliding window to the processed function call sequence 502 is found in the suffix tree, the malicious code detection server may determine that malicious code exists, and may notify the client of the result of the determination.

According to one embodiment, the malicious code detection server can use the suffix tree 503, a signature model generated in advance, during the detection process.

FIG. 6 illustrates a process of generating a suffix tree according to an embodiment of the present invention.

Referring to FIG. 6, the process by which the suffix tree generation server creates a suffix tree is shown. The suffix tree generation process performed by the suffix tree generation server may include the following steps.

In step S610, the suffix tree generation server can execute the programs of the malicious code database 601 and the normal file database 602. Of course, there may be a plurality of programs. At this time, a program of the malicious code database 601 may be a malicious code sample file, and a program of the normal file database 602 may be a normal sample file. In addition, there may be a plurality of malicious code sample files or normal sample files.

In step S620, the suffix tree generation server may extract the function call sequence. Further, in step S630, the suffix tree creation server may process the extracted function call sequence.

In step S640, the suffix tree generation server may generate the suffix tree and add the processed sequences to it. Further, in step S650, the suffix tree generation server may remove the function call sequence of the normal sample file.

According to one embodiment, the suffix tree generation server may execute both the known malicious code sample files and the normal sample files in order. Also, the suffix tree generation server can extract each function call sequence while executing the sample file. At this time, the suffix tree generation server may merge the repeated elements of the function call sequences preferentially as a function call sequence processing step. The suffix tree generation server can generate a suffix tree by using the processed function call sequence after the malicious code sample file and the function call sequence of the normal sample file are processed.

According to one embodiment, once the function call sequences of all malicious code sample files and normal sample files have been added, the suffix tree generation server traverses each node of the suffix tree, searches for function call subsequences of the normal sample files, and removes the nodes and edges that represent them. Here, a suffix may be a function call subsequence. The server performs this removal because malicious code may perform actions similar to those of normal programs. If a function call sequence representing such behavior remained in the suffix tree, a normal file could be classified as malicious. The suffix tree generation server may therefore remove the function call sequences found in normal sample files to prevent normal files from being classified as malicious.

FIG. 7A shows a state before the removal of a function call sequence according to an embodiment of the present invention.

Referring to FIG. 7A, the state before the function call sequence of the normal sample file is removed is shown. A suffix tree can be a structure that represents the suffixes contained in several sequences. The edges that make up the tree may be labeled with the subsequence used for the transition between nodes. In addition, each node may be labeled to indicate which of the sequences input when the suffix tree was generated contains the subsequence obtained by concatenating the edge labels along the path from the root node to that node.

FIG. 7B illustrates a procedure for removing a function call sequence according to an embodiment of the present invention.

Referring to FIG. 7B, a procedure for removing a function call sequence of a normal sample file is shown.

For example, when tx = {2,3,4,5} is input into the suffix tree shown in FIG. 7A, the search follows the edge labeled {2,3,4} and the edge labeled {5}, and the label of the lowest node reached is mt.1, so tx is a subsequence of mt.1. By looking at each node label, it is possible to know which of the input sequences used at creation time a given subsequence belongs to. It is therefore also possible to know whether it is a subsequence of a malicious code sample file or of a normal sample file.

According to one embodiment, to remove the sequences of normal sample files, the suffix tree generation server traverses the nodes, checks each label, and removes the node and its associated edge if the label includes a normal sample file.

FIG. 7C illustrates the removal of a function call sequence according to an embodiment of the present invention.

Referring to FIG. 7C, the normal sequences are removed from the suffix tree generated from two malicious sequences (mt.1, mt.2) and two normal sequences (bt.1, bt.2). As a result, all nodes and edges associated with the suffixes included in bt.1 and bt.2 have been removed. The suffix tree generation server can use a depth-first search (DFS) algorithm for the node traversal used in the removal.
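The labeling and removal steps described for FIGS. 7A to 7C can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: it uses an uncompressed suffix trie (one element per edge) built naively, rather than a compressed suffix tree with multi-element edge labels, and the names `Node`, `build_trie`, `remove_benign`, `contains`, and the sample sequences are all hypothetical.

```python
class Node:
    def __init__(self):
        self.children = {}   # element -> child Node (one element per edge)
        self.labels = set()  # names of input sequences whose suffixes pass through this node


def build_trie(sequences):
    """Insert every suffix of every named sequence (naive construction)."""
    root = Node()
    for name, seq in sequences.items():
        for start in range(len(seq)):
            node = root
            for elem in seq[start:]:
                node = node.children.setdefault(elem, Node())
                node.labels.add(name)
    return root


def remove_benign(node, benign_names):
    """Depth-first traversal that drops any node labeled with a normal sequence,
    together with the edge leading to it (and therefore its whole subtree)."""
    for elem in list(node.children):
        child = node.children[elem]
        if child.labels & benign_names:
            del node.children[elem]
        else:
            remove_benign(child, benign_names)


def contains(root, subseq):
    """Return True if subseq can be followed from the root."""
    node = root
    for elem in subseq:
        node = node.children.get(elem)
        if node is None:
            return False
    return True


# Hypothetical sequences; the integers stand for function call identifiers.
sequences = {
    "mt.1": [1, 2, 3, 4, 5],
    "mt.2": [6, 2, 3, 4],
    "bt.1": [2, 3, 7],
    "bt.2": [8, 9],
}
root = build_trie(sequences)
found_before = contains(root, [2, 3, 7])       # True: present via bt.1
remove_benign(root, {"bt.1", "bt.2"})
found_after = contains(root, [2, 3, 7])        # False: removed with bt.1
still_malicious = contains(root, [1, 2, 3, 4, 5])  # True: labeled only mt.1
```

Note that in this sketch any path beginning with an element that also begins a normal subsequence is removed as well, since the first node on that path carries a normal label.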

FIG. 8 is a block diagram illustrating a suffix tree generation server according to an embodiment of the present invention.

Referring to FIG. 8, the suffix tree generation server 800 may include a sample file execution unit 810, a function call sequence extraction unit 820, a function call sequence processing unit 830, and a suffix tree generation unit 840. The suffix tree generation server 800 may be, but is not limited to, a computing device comprising at least one of a processor, a memory, and a data transceiver. The sample file execution unit 810, the function call sequence extraction unit 820, the function call sequence processing unit 830, and the suffix tree generation unit 840 may each include at least one of a processor, a memory, an electronic circuit, an electric circuit, an integrated circuit, an electronic device, a magnetic device, and a data transceiver, but are not limited thereto.

The sample file execution unit 810 can execute the malicious code sample file, which includes malicious code, and the normal sample file.

The function call sequence extracting unit 820 can extract a first function calling sequence for a malicious code sample file and a second function calling sequence for a normal sample file.

The function call sequence processing unit 830 can process the extracted first function call sequence and the second function call sequence. Specifically, the function call sequence processing unit 830 can process the first function call sequence and the second function call sequence by merging the successively repeated elements in the extracted first function call sequence and the second function call sequence.
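As an illustration of this processing step, the following minimal sketch collapses each run of consecutively repeated elements into a single element. It assumes that "successively repeated elements" means identical single elements appearing back to back; the function name `merge_repeats` and the sample values are hypothetical.

```python
from itertools import groupby


def merge_repeats(calls):
    """Collapse each run of identical consecutive elements into one element."""
    return [key for key, _ in groupby(calls)]


# Integers stand for function call identifiers.
merged = merge_repeats([2, 2, 3, 4, 4, 4, 5, 2])
print(merged)  # [2, 3, 4, 5, 2]
```

Only adjacent repeats are merged, so the trailing 2 in the example is kept because it is separated from the earlier run of 2s.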

The suffix tree generating unit 840 can generate the suffix tree using the processed first function call sequence and the second function call sequence. Specifically, the suffix tree generation unit 840 can generate a temporary suffix tree based on the processed first function call sequence and the second function call sequence, and can determine the final suffix tree by removing, from the temporary suffix tree, a node corresponding to a subsequence of the first function call sequence and an edge connecting the node.

FIG. 9 is a block diagram illustrating a malicious code detection server according to an embodiment of the present invention.

Referring to FIG. 9, the malicious code detection server 900 may include a suffix tree loading unit 910, a function call sequence receiving unit 920, a function call sequence processing unit 930, a suffix tree searching unit 940, a malicious code determining unit 950, and a malicious code information transmitting unit 960. The malicious code detection server 900 may be, but is not limited to, a computing device including at least one of a processor, a memory, and a data transceiver. The suffix tree loading unit 910, the function call sequence receiving unit 920, the function call sequence processing unit 930, the suffix tree searching unit 940, the malicious code determining unit 950, and the malicious code information transmitting unit 960 may each include at least one of a processor, a memory, an electronic circuit, an electric circuit, an integrated circuit, an electronic device, a magnetic device, and a data transceiver, but are not limited thereto.

The suffix tree loading unit 910 can load the suffix tree generated based on the malicious code sample file.

The suffix tree may be the final suffix tree generated by processing the function call sequence extracted from the normal sample file and the function call sequence extracted from the malicious code sample file containing malicious code, and then removing, from the temporary suffix tree derived from the processed function call sequences, the nodes for the function call sequence extracted from the normal sample file.

The function call sequence receiving unit 920 can receive the function call sequence of the target sample file from the client.

The function call sequence processing unit 930 can process the received function call sequence.

The function call sequence processing unit 930 can process the function call sequence by merging successively repeated elements in the function call sequence received from the client.

The suffix tree searching unit 940 can search the suffix tree for the subsequence of the processed function call sequence.

The malicious code determining unit 950 can determine whether the target sample file is malicious code based on the search result for the suffix tree.

The malicious code information transmitting unit 960 can transmit to the client information indicating that a malicious code exists in the target sample file when the target sample file is determined to be malicious code.

FIG. 10 is a block diagram illustrating a suffix tree search unit according to an embodiment of the present invention.

Referring to FIG. 10, the suffix tree searching unit 1000 may include a subsequence extracting unit 1010 and a subsequence matching unit 1120. The suffix tree searching unit 1000 may be configured to include at least one of a processor, a memory, an electronic circuit, an electric circuit, an integrated circuit, an electronic element, a magnetic element, and a data transceiver, but is not limited thereto. The subsequence extracting unit 1010 and the subsequence matching unit 1120 may each include at least one of a processor, a memory, an electronic circuit, an electric circuit, an integrated circuit, an electronic device, a magnetic device, and a data transceiver, but are not limited thereto.

The subsequence extracting unit 1010 can extract a subsequence by applying a sliding window to the processed function calling sequence.

The subsequence extracting unit 1010 can extract the subsequence by sliding the sliding window according to a predetermined element unit in the function calling sequence.

The subsequence matching unit 1120 can determine whether or not the extracted subsequence matches a sequence derived from nodes and edges of the suffix tree.
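The extraction and matching steps above can be sketched as follows. This is a simplified illustration, not the patented implementation: it matches windows against an uncompressed suffix trie held in nested dictionaries, and the window size, the step, and the names `sliding_windows`, `build_suffix_trie`, `matches`, and `search` are assumptions introduced here.

```python
def sliding_windows(seq, size, step=1):
    """Return every window of `size` elements, moving `step` elements at a time."""
    return [tuple(seq[i:i + size]) for i in range(0, len(seq) - size + 1, step)]


def build_suffix_trie(sequences):
    """Nested-dict trie holding every suffix of every sequence (naive build)."""
    root = {}
    for seq in sequences:
        for start in range(len(seq)):
            node = root
            for elem in seq[start:]:
                node = node.setdefault(elem, {})
    return root


def matches(trie, subseq):
    """True if subseq can be followed along nodes and edges from the root."""
    node = trie
    for elem in subseq:
        if elem not in node:
            return False
        node = node[elem]
    return True


def search(trie, processed_calls, size, step=1):
    """Return the windows of the processed sequence that are found in the trie."""
    return [w for w in sliding_windows(processed_calls, size, step) if matches(trie, w)]


# Hypothetical data: one known malicious sequence and one received target sequence.
trie = build_suffix_trie([[1, 2, 3, 4, 5]])
hits = search(trie, [9, 2, 3, 4, 7], size=3)
print(hits)  # [(2, 3, 4)]
```

How the matched windows are turned into a verdict (for example, treating any match as malicious, or requiring a minimum number of matches) is not specified in the text above and would be a separate design choice.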

According to one embodiment, the suffix tree generation server 800 may, in some cases, be the malicious code detection server 900. Likewise, the malicious code detection server 900 may be the suffix tree generation server 800. In addition, the malicious code detection server 900 may be configured as a malicious code detection framework that utilizes a suffix tree, but is not limited thereto. The malicious code detection server 900 may be a malicious code behavior detection framework using a structure called a suffix tree. A suffix tree is a structure that stores multiple strings efficiently and can be searched in linear time. The malicious code detection framework can therefore be used to find an arbitrary sequence within the function call sequences of a plurality of known malicious codes.

According to one embodiment, the malicious code detection server 900, as a malicious code detection framework, can perform suffix tree generation and real-time detection. For example, to generate a suffix tree, the malicious code detection server 900 may use the function call sequences extracted from malicious code sample files, which include known malicious code, and from normal sample files, which are normal programs, to model a suffix tree that represents function call sequences found only in malicious code. In addition, to determine whether a target sample file is malicious, the malicious code detection server 900 may search the suffix tree for the function call sequence that the target sample file, which is an arbitrary program, generates while it is executed.

According to one embodiment, the malicious code detection server 900 may be a dynamic signature-based malicious code detection system. It can also be configured in a server-client arrangement, thereby reducing the overhead that may occur on the user device. The server and the client can divide the work between them. For example, the client can extract the function call sequence of the target sample file, which is an arbitrary program, and transmit it to the malicious code detection server 900. The malicious code detection server 900 can then compare the function call sequence transmitted from the client with the existing dynamic signatures of malicious code. Because it uses a suffix tree search algorithm when performing the signature comparison, the malicious code detection server 900 can perform very fast, real-time analysis.

According to one embodiment, the suffix tree generation method and the malicious code detection method performed by the malicious code detection server 900 can be applied to a PC-based malicious code detection system (anti-virus system), a mobile malicious code detection system, and the like, when determining whether a target sample file, which is an arbitrary program that has entered the device, is malicious.

In addition, the malicious code detection server 900 can, as PC or mobile anti-virus software, perform the suffix tree generation method and the malicious code detection method to determine whether a target sample file, which is an arbitrary program that has entered a user's PC, is malicious.

According to one embodiment, the suffix tree generation method and the malicious code detection method can be applied to a general user PC and a mobile environment for determining whether a malicious program exists or not.

In addition, the suffix tree generation method and the malicious code detection method can be utilized for PC-based or mobile-based malicious code detection, and furthermore for malicious code analysis and classification.

The methods according to embodiments of the present invention may be implemented in the form of program instructions that can be executed through various computer means and recorded in a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be those specially designed and configured for the present invention or may be available to those skilled in the art of computer software.

While the invention has been shown and described with reference to certain preferred embodiments thereof, it will be understood by those of ordinary skill in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Therefore, the scope of the present invention should not be limited by the illustrated embodiments, but should be determined by the equivalents of the claims, as well as the claims that follow.

Claims (18)

A method for generating a suffix tree, which is performed by a suffix tree generation server including a sample file execution unit, a function call sequence extraction unit, a function call sequence processing unit, and a suffix tree generation unit,
The sample file execution unit executing a malicious code sample file including malicious code and a normal sample file;
The function call sequence extracting unit extracting a first function call sequence for the malicious code sample file and a second function call sequence for the normal sample file;
The function call sequence processing unit processing the extracted first function call sequence and the second function call sequence; And
Wherein the suffix tree generating unit generates the suffix tree using the processed first function call sequence and the second function call sequence
The method according to claim 1,
The processing step comprises:
Wherein the function call sequence processing unit processes the first function call sequence and the second function call sequence by merging successively repeated elements in the extracted first function call sequence and the second function call sequence.
The method according to claim 1,
Wherein the generating the suffix tree comprises:
The suffix tree generating unit generating a temporary suffix tree based on the processed first function call sequence and the second function call sequence;
Wherein the suffix tree generator removes a node corresponding to a subsequence of a first function call sequence and an edge connecting the node in the temporary suffix tree to determine a final suffix tree
A malicious code detection method performed by a malicious code detection server including a suffix tree loading unit, a function calling sequence receiving unit, a function calling sequence processing unit, a suffix tree searching unit, and a malicious code determining unit,
The suffix tree loading unit loading a suffix tree generated based on a malicious code sample file;
The function call sequence receiving unit receiving a function call sequence of a target sample file from a client;
Processing the received function calling sequence by the function calling sequence processing unit;
Searching the suffix tree for a subsequence of the processed function call sequence; And
Wherein the malicious code determination unit determines whether the target sample file is a malicious code based on a search result for the suffix tree
A malicious code detection method.
5. The method of claim 4,
The suffix tree includes:
A function call sequence extracted from a normal sample file and a function call sequence extracted from a malicious code sample file containing malicious code are processed,
A malicious code detection method, which is a final suffix tree generated by removing nodes for a function call sequence extracted from a normal sample file in a temporary suffix tree derived from a processed function call sequence.
5. The method of claim 4,
The processing step comprises:
Wherein the function call sequence processing unit processes the function call sequence by merging successively repeated elements in a function call sequence received from the client.
5. The method of claim 4,
Wherein the suffix tree searching unit includes a subsequence extracting unit and a subsequence matching unit,
Wherein the searching comprises:
Extracting a subsequence by applying a sliding window to the processed function call sequence;
Wherein the subsequence matching unit determines whether the extracted subsequence matches a sequence derived as a node and an edge of a suffix tree
A malicious code detection method.
8. The method of claim 7,
Wherein the extracting the subsequence comprises:
Wherein the subsequence extracting unit extracts a subsequence by sliding the sliding window according to a predetermined element unit in the function calling sequence.
5. The method of claim 4,
The malicious code detection server may further include a malicious code information transmitting unit,
When the malicious code information transmitting unit determines that the target sample file is a malicious code, transmitting information indicating that the malicious code exists in the target sample file to the client
The malicious code detection method further comprising:
In a suffix tree generation server,
A sample file execution unit for executing a malicious code sample file containing malicious code and a normal sample file;
A function call sequence extracting unit for extracting a first function call sequence for the malicious code sample file and a second function call sequence for the normal sample file;
A function call sequence processing unit for processing the extracted first function call sequence and the second function call sequence; And
A suffix tree generating unit for generating a suffix tree using the processed first function call sequence and the second function call sequence,
A suffix tree generation server comprising
11. The method of claim 10,
The function call sequence processing unit,
And processing the first function call sequence and the second function call sequence by merging successively repeated elements in the extracted first function call sequence and the second function call sequence.
11. The method of claim 10,
Wherein the suffix tree generating unit comprises:
Generating a temporary suffix tree based on the processed first function call sequence and the second function call sequence, and removing a node corresponding to a subsequence of the first function call sequence in the temporary suffix tree and an edge connecting the node, to determine the final suffix tree.
In a malicious code detection server,
A suffix tree loading unit for loading the generated suffix tree based on the malicious code sample file;
A function call sequence receiving unit for receiving a function call sequence of a target sample file from a client;
A function call sequence processing unit for processing the received function call sequence;
A suffix tree search unit for searching a suffix tree for a subsequence of the processed function call sequence; And
A malicious code determination unit for determining whether the target sample file is a malicious code based on a search result for the suffix tree;
A malicious code detection server.
14. The method of claim 13,
The suffix tree includes:
A function call sequence extracted from a normal sample file and a function call sequence extracted from a malicious code sample file containing malicious code are processed,
A malicious code detection server, which is a final suffix tree generated by removing nodes for a function call sequence extracted from a normal sample file from a temporary suffix tree derived from a processed function call sequence.
14. The method of claim 13,
The function call sequence processing unit,
And processing the function call sequence by merging successively repeated elements in a function call sequence received from the client.
14. The method of claim 13,
Wherein the suffix tree searching unit comprises:
A subsequence extracting unit for extracting a subsequence by applying a sliding window to the processed function calling sequence;
A subsequence matching unit for determining whether the extracted subsequence matches a sequence derived from a node and an edge of a suffix tree,
A malicious code detection server.
17. The method of claim 16,
Wherein the subsequence extracting unit comprises:
Wherein the sliding window overlaps and moves according to a predetermined element unit in the function calling sequence, thereby extracting the subsequence.
14. The method of claim 13,
A malicious code information transmitting unit for transmitting to the client information indicating that a malicious code exists in the target sample file when the target sample file is determined to be a malicious code;
Further comprising a malicious code detection server.

Priority Applications (1)

Application Number Priority Date Filing Date Title
KR1020150092118A KR101726360B1 (en) 2015-06-29 2015-06-29 Method and server for generating suffix tree, method and server for detecting malicious code with using suffix tree

Publications (2)

Publication Number Publication Date
KR20170002115A KR20170002115A (en) 2017-01-06
KR101726360B1 true KR101726360B1 (en) 2017-04-13

Family

ID=57832240

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1020150092118A KR101726360B1 (en) 2015-06-29 2015-06-29 Method and server for generating suffix tree, method and server for detecting malicious code with using suffix tree

Country Status (1)

Country Link
KR (1) KR101726360B1 (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101472321B1 (en) * 2013-06-11 2014-12-12 고려대학교 산학협력단 Malignant code detect method and system for application in the mobile

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101091204B1 (en) * 2010-02-26 2011-12-09 인하대학교 산학협력단 A method for intrusion detection by pattern search
KR101230271B1 (en) * 2010-12-24 2013-02-06 고려대학교 산학협력단 System and method for detecting malicious code
KR101329037B1 (en) * 2011-12-21 2013-11-14 한국인터넷진흥원 System and method for detecting variety malicious code

Also Published As

Publication number Publication date
KR20170002115A (en) 2017-01-06

Legal Events

Date Code Title Description
A201 Request for examination
E902 Notification of reason for refusal
GRNT Written decision to grant