CN102411687B - Deep learning detection method of unknown malicious codes - Google Patents

Info

Publication number
CN102411687B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201110373558.0A
Other languages
Chinese (zh)
Other versions
CN102411687A (en)
Inventor
李元诚
樊庆君
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
North China Electric Power University
Original Assignee
North China Electric Power University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by North China Electric Power University
Priority to CN201110373558.0A
Publication of CN102411687A
Application granted
Publication of CN102411687B
Legal status: Expired - Fee Related

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a deep learning method for detecting unknown malicious code, belonging to the technical field of information security. The method comprises the following steps: first, extracting feature vectors from the files in a training set using byte-level n-grams; second, constructing an HTM (Hierarchical Temporal Memory) network structure and determining the input data length of each bottom-layer node of the HTM structure; third, using the feature vectors as input, performing sequence-pattern learning, training, and classification inference with the HTM algorithm; fourth, extracting feature vectors from the files in a test set using byte-level n-grams; fifth, feeding these feature vectors into the trained HTM network for sequence recognition, to determine whether the files in the test set contain malicious code. The method offers relatively strong noise resistance, fault tolerance, and adaptability. It also improves the recognition capability and recognition rate of malicious code detection, enabling accurate detection of newly emerging malicious code.

Description

Deep learning detection method for unknown malicious code
Technical field
The invention belongs to the field of information security technology, and in particular relates to a deep learning method for detecting unknown malicious code.
Background
With the development of computer and network technology, computers have become indispensable tools in daily life. To gain economic or political advantage, or to take personal revenge, many organizations and individuals use malicious code to carry out illegal activities. As a result, new kinds of malicious code keep emerging, the techniques they use grow more advanced, and their ability to spread, cause damage, and hide keeps increasing. Although detection techniques are also developing continuously, detection technology and detection capability still lag behind the development of malicious code. Detecting unknown malicious code in particular poses a major challenge to current detection technology.
At present there are two main techniques for detecting malicious code: pattern matching based on signatures, and detection based on malicious-code behavior rules.
Signature-based pattern matching compares the signature of the file under inspection with the malicious-code signature strings in a signature database. A successful match means the file contains malicious code; otherwise the file is considered free of malicious code. This technique requires analysts to find and obtain a malicious-code sample as early as possible and to extract a signature that uniquely identifies it. The signature must also be added to the signature database in time, so that the malicious code can be detected before it spreads widely and breaks out. The technique is poorly suited to malicious code that uses polymorphic or metamorphic techniques, and to malicious code that spreads rapidly, is highly destructive, or is highly targeted.
Detection based on behavior rules identifies malicious code according to common behavior rules of malicious code that experts define in advance. The underlying principle is that running malicious code is usually accompanied by behaviors such as changes to user privileges, registry modifications, opened network ports, and abnormal network communication, or by a particular sequence of system operations. This technique suffers from serious latency: especially as computers become much faster, by the time the malicious behavior is detected it has often already done irreparable damage to the system.
Both techniques detect after the fact: they can detect only known malicious code, or can detect malicious code only after it has been executed, by which time the damage has been done.
Summary of the invention
To address the above shortcomings, the present invention discloses a deep learning method for detecting unknown malicious code, comprising the following steps:
1) extract the feature vectors of the files in the training set using byte-level n-grams;
2) build a multi-layer HTM network model, and determine the input data length of each bottom-layer node of the HTM structure by slicing the file feature vector;
3) use the sliced file feature vectors as the input of the HTM network; after all bottom-layer nodes have finished learning, their inference-stage outputs are concatenated and passed upward layer by layer until the nodes in every layer of the HTM network have been trained;
4) extract the feature vectors of the files in the test set using byte-level n-grams;
5) feed these feature vectors into the trained HTM network for sequence recognition, to determine whether the files in the test set contain malicious code.
Step 2) specifically comprises the following steps:
21) select an HTM network model with F layers, in which every node except those in the bottom layer has M child nodes;
22) slice the file feature vector using the formula $l = L / M^{F-1}$, and use the resulting slices, in order, as the input samples of the bottom-layer nodes of the HTM network, where F is the number of layers of the HTM network structure, M is the number of child nodes of each node outside the bottom layer, L is the length of the file feature vector extracted with the byte-level n-gram method, and l is the input sample length of each bottom-layer node.
Step 3) specifically comprises the following steps:
31) use the sliced file feature vectors as the input of the HTM network; the bottom-layer nodes enter the learning stage, which lasts until the spatial pooler of every bottom-layer node has finished learning the spatial patterns and the temporal pooler of every bottom-layer node has finished learning the temporal groups;
32) once the learning stage of step 31) is finished, the bottom-layer nodes enter the inference stage; each new input sample is processed by inference at the bottom-layer node and the result is output to its parent node in the next layer up of the HTM network; the outputs of the lower-layer nodes that share a parent are concatenated to form the input of the learning stage of the next layer's node, which then enters the learning stage and repeats the node learning process of step 31);
33) repeat the process of step 32) until the nodes in every layer of the HTM network have completed sequence-pattern training.
Step 31) specifically comprises the following steps:
311) the binary sequences input to a node are fed to its spatial pooler, which learns clusters of these sequences using a maximum-distance parameter D; using the maximum distance D, the spatial pooler stores a subset of the input patterns, called cluster centers; as time passes, the number of new sequence patterns the spatial pooler produces per unit time decreases, and the clustering process stops when the number of new cluster centers per time period falls below a set threshold;
312) the sequence patterns learned by the spatial pooler are output to the temporal pooler, which groups them according to their temporal adjacency; the grouping computation ends once all sequence patterns have been grouped.
Step 32) specifically comprises the following steps:
321) use the formula
$$P(e^- \mid c_i) = \gamma \prod_{k=1}^{M} \mathrm{input}(m_k^i)$$
to compute the probability distribution of the input sequence $e^-$ over the spatial patterns $c_i$ of the node's spatial pooler, and take the normalized result as the output of the spatial pooler, where $m_k^i$ denotes a non-zero spatial pattern, M is the number of child nodes of this node, and $e^-$ is the input sequence from the bottom layer that is to be recognized;
322) from the output y of the spatial pooler, use the formula
$$\lambda(i) = \sum_{j=1}^{N_c} P(c_j \mid g_i)\, y(j)$$
to compute the output of the temporal pooler, where $N_c$ is both the length of the vector y and the number of spatial patterns in the spatial pooler, and the length of $\lambda$ is $N_g$.
The beneficial effects of the present invention are as follows. It introduces the HTM algorithm, a new artificial-intelligence deep learning algorithm that imitates the structure and working principles of the human neocortex. The algorithm adopts a hierarchical tree network structure and applies the Bayesian-network principles of continuous information sharing between nodes and belief propagation, converting complex problems into pattern matching and prediction. The input data needs no complex preprocessing, and the method has strong noise resistance, fault tolerance, and adaptability. In addition, new input patterns can be learned while previously learned patterns are used for inference, which improves the recognition capability and recognition rate of malicious code detection and achieves the goal of accurately detecting newly emerging malicious code.
Brief description of the drawings
Fig. 1 is a schematic diagram of the process of the method for detecting unknown malicious code;
Fig. 2a is a schematic diagram of the HTM hierarchical tree structure model;
Fig. 2b is a schematic diagram of the spatial pooler and temporal pooler of node k;
Fig. 3 is a schematic diagram of the learning process of one node during HTM training;
Fig. 4 is a schematic diagram of the belief propagation computation at node k during HTM inference.
Detailed description
The preferred embodiment is described in detail below with reference to the accompanying drawings. It should be emphasized that the following description is only exemplary and is not intended to limit the scope or application of the invention.
The idea behind the invention is as follows. A set of files containing malicious code serves as the training sample. Byte-level n-grams are used to select features from the training-set files, so that each file corresponds to one feature vector, and these feature vectors are used as the input of the HTM algorithm to train the HTM network. Finally, features are selected from an unknown file to produce its feature vector, which is fed to the trained HTM network for pattern recognition, thereby determining whether the file contains malicious code.
As shown in Fig. 1, the intelligent method for detecting unknown malicious code comprises the following steps:
1) Extract the feature vectors of the files in the training set using byte-level n-grams.
Standard data sets built specifically for malicious code detection can be downloaded from the Internet. The training set is constructed by selecting files from such a standard data set according to a chosen rule, for example by malicious-code category.
A byte-level n-gram is obtained by sliding a window of n bytes over a binary byte stream or a text, so that each extracted word is n bytes long. For example, if the content of a text is "abcdef", its 2-gram sequence is ab, bc, cd, de, ef, and its 3-gram sequence is abc, bcd, cde, def.
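The sliding-window extraction described above can be sketched in a few lines of Python. This is a minimal illustration rather than the patent's implementation; the function name `byte_ngrams` is hypothetical.

```python
def byte_ngrams(data: bytes, n: int) -> list:
    """Slide an n-byte window over `data`, advancing one byte at a time."""
    if n <= 0:
        raise ValueError("n must be positive")
    # A window starting at position i covers data[i:i + n]; the last valid
    # start is len(data) - n, so inputs shorter than n yield no n-grams.
    return [data[i:i + n] for i in range(len(data) - n + 1)]


print(byte_ngrams(b"abcdef", 2))  # [b'ab', b'bc', b'cd', b'de', b'ef']
print(byte_ngrams(b"abcdef", 3))  # [b'abc', b'bcd', b'cde', b'def']
```

Working on `bytes` rather than `str` means the same function applies to both text and binary files.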
Take a file whose content is "abcd" as an example. Its 2-gram sequence is ab, bc, cd, so the file is said to have three attributes, and the vector formed by these attributes, {ab, bc, cd}, represents the file.
Quantizing each attribute yields the feature vector of the file. Taking the vector {ab, bc, cd} as an example, let each letter be valued by its position in the alphabet (a is 1, b is 2, c is 3, d is 4) and quantize each attribute as the sum of the positions of its letters. The quantized value of ab is then 3, that of bc is 5, and that of cd is 7, so the vector {3, 5, 7} is the feature vector of the file.
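The quantization rule in this example (sum of alphabet positions) can be reproduced as follows. The sketch covers only lowercase letters, as in the example; the text does not say how arbitrary byte values are quantized, so that case is left out. The names `letter_position`, `quantize`, and `feature_vector` are hypothetical.

```python
def letter_position(ch: str) -> int:
    """Position of a lowercase letter in the alphabet: a -> 1, b -> 2, ..."""
    if not ("a" <= ch <= "z"):
        raise ValueError("this sketch only handles lowercase letters")
    return ord(ch) - ord("a") + 1


def quantize(gram: str) -> int:
    """Quantize one n-gram as the sum of its letters' alphabet positions."""
    return sum(letter_position(ch) for ch in gram)


def feature_vector(text: str, n: int) -> list:
    """Extract the n-grams of `text` and quantize each of them."""
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    return [quantize(g) for g in grams]


print(feature_vector("abcd", 2))  # [3, 5, 7]
```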
2) Build a multi-layer HTM network model, and determine the input data length of each bottom-layer node of the HTM structure by slicing the file feature vector. Fig. 2a shows a 3-layer tree structure model in which every node except the bottom-layer nodes has two child nodes. Fig. 2b shows the internal structure of a single node k of the HTM structure, which consists of a spatial pooler and a temporal pooler. Step 2) specifically comprises the following steps:
21) Select an HTM network structure with F layers, in which every node except those in the bottom layer has M child nodes.
As shown in Fig. 2a, with F = 3 and M = 2, layers L3, L2, and L1 of the HTM network have 1, 2, and 4 nodes respectively.
22) Slice the file feature vector using the formula $l = L / M^{F-1}$, and use the resulting slices, in order, as the input samples of the bottom-layer nodes of the HTM network, where F is the number of layers of the HTM network structure, M is the number of child nodes of each node outside the bottom layer, L is the length of the file feature vector extracted with the byte-level n-gram method, and l is the input sample length of each bottom-layer node.
Suppose the file feature vector is {1, 2, 3, 4, 5, 6, 7, 8}, so that its length L is 8. Then l = 2, and the input samples of the bottom-layer nodes of the HTM network are {1, 2}, {3, 4}, {5, 6}, and {7, 8} respectively.
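The slicing rule can be checked with a short Python sketch. A tree with F layers and M children per node has M^(F-1) bottom-layer nodes, so each receives L / M^(F-1) elements. The function names are hypothetical, and raising an error when L is not evenly divisible is an assumption of this sketch; the text does not cover that case.

```python
def bottom_input_length(L: int, M: int, F: int) -> int:
    """Input length l = L / M**(F - 1) of each bottom-layer node."""
    bottom_nodes = M ** (F - 1)
    if L % bottom_nodes != 0:
        # Assumption: the source does not say how to handle a remainder.
        raise ValueError("L must be divisible by M ** (F - 1)")
    return L // bottom_nodes


def slice_for_bottom_nodes(vector: list, M: int, F: int) -> list:
    """Cut the feature vector into consecutive slices, one per bottom node."""
    l = bottom_input_length(len(vector), M, F)
    return [vector[i:i + l] for i in range(0, len(vector), l)]


print(slice_for_bottom_nodes([1, 2, 3, 4, 5, 6, 7, 8], M=2, F=3))
# [[1, 2], [3, 4], [5, 6], [7, 8]]
```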
3) Use the sliced file feature vectors as the input of the HTM network; after all bottom-layer nodes have finished learning, their inference-stage outputs are concatenated and passed upward layer by layer until the nodes in every layer of the HTM network have been trained.
The training in this stage is completed layer by layer. After the bottom layer has finished learning, its nodes enter the inference stage when new input arrives, and the inference output serves as the input of the learning stage of the nodes in the next layer. The same ordering holds within a single node: the temporal pooler starts temporal grouping only after the node's spatial pooler has finished learning the sequence patterns.
Step 3) specifically comprises the following steps:
31) As shown in Fig. 3, use the sliced file feature vectors as the input of the HTM network; the bottom-layer nodes enter the learning stage, which lasts until the spatial pooler of every bottom-layer node has finished learning the spatial patterns and the temporal pooler of every bottom-layer node has finished learning the temporal groups.
Specifically, step 31) comprises the following steps:
311) The binary sequence patterns of the sliced file feature vectors are fed to the spatial pooler of each bottom-layer node, which learns clusters of these sequences using the maximum distance D. Using the maximum distance D, the spatial pooler stores a subset of the input binary sequence patterns, called cluster centers. As time passes, the number of new sequence patterns the spatial pooler produces per unit time decreases, and the clustering process stops when the number of new cluster centers within one time period T falls below a set threshold. The time period T may be any non-zero value, and the threshold is a non-zero integer. To improve learning efficiency, T and the threshold are usually given small values (for example, T = 5 s and a threshold of 1).
D is the minimum Euclidean distance at which a binary sequence pattern is judged to differ from the existing cluster centers. For each input binary sequence pattern, the spatial pooler checks whether an existing cluster center lies within Euclidean distance D. There are two cases: if one exists, nothing changes; if none exists, the new binary sequence pattern is added to the list of cluster centers.
The Euclidean distance is computed as follows: for $x, y \in \mathbb{R}^N$, the Euclidean distance between x and y is
$$\left( \sum_{i=1}^{N} (x_i - y_i)^2 \right)^{1/2}$$
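The cluster-center rule of step 311) can be sketched as below: a pattern becomes a new cluster center only if no stored center lies within Euclidean distance D of it. This is a simplified illustration with hypothetical function names; it processes a fixed list of patterns and leaves out the stopping rule based on the time period T and the threshold.

```python
import math


def euclidean(x, y) -> float:
    """Euclidean distance between two equal-length numeric sequences."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def learn_cluster_centers(patterns, max_distance: float) -> list:
    """Store a pattern as a new center unless one lies within max_distance."""
    centers = []
    for pattern in patterns:
        close = any(euclidean(pattern, c) <= max_distance for c in centers)
        if not close:
            centers.append(list(pattern))
    return centers


patterns = [[0, 0], [0, 1], [5, 5], [5, 6], [0, 0.5]]
print(learn_cluster_centers(patterns, max_distance=1.5))
# [[0, 0], [5, 5]]
```

A larger D merges more patterns into each center and so yields fewer centers; a smaller D keeps finer distinctions at the cost of more stored patterns.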
312) The binary sequence patterns learned by the spatial pooler are output to the temporal pooler, which groups them according to their temporal adjacency; the grouping computation ends once all sequence patterns have been grouped.
Step 312) specifically comprises the following steps:
3121) As the temporal pooler receives input from the spatial pooler, it builds a temporal adjacency matrix from the order in which the binary sequence patterns occur in time. Once the temporal adjacency matrix has been formed, the patterns must be partitioned into groups. HTM performs this temporal grouping with a greedy algorithm.
3122) Find the most connected cluster point that is not yet in any group. The most connected cluster point is the one whose row in the temporal adjacency matrix has the largest sum.
3123) Select the N_top (N_top is a specified parameter) most strongly connected neighbor cluster points of the cluster point found in step 3122); the temporal pooler adds the selected cluster points to the current group.
3124) Repeat step 3123) for each cluster point X newly added to the group. The grouping process stops automatically once the N_top closest neighbor cluster points of every such X already belong to the group. If the size of the group reaches a set value (the maximum group size) before the process has stopped automatically, the process is terminated.
3125) The resulting set of cluster points is added to the temporal pooler as a new group. Then return to step 3122) until all cluster points have been grouped.
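Steps 3121) to 3125) can be sketched as follows. Several details are assumptions of this sketch, not statements from the text: the adjacency matrix counts direct transitions between different cluster centers and is made symmetric, ties are broken by the lower index, and the termination limit is read as a maximum group size. The function and parameter names are hypothetical.

```python
def temporal_adjacency(sequence, num_centers: int) -> list:
    """Count how often two cluster centers occur consecutively in time."""
    adj = [[0] * num_centers for _ in range(num_centers)]
    for a, b in zip(sequence, sequence[1:]):
        if a != b:
            adj[a][b] += 1
            adj[b][a] += 1  # assumption: adjacency is treated as symmetric
    return adj


def greedy_groups(adj, n_top: int, max_group_size: int) -> list:
    """Greedily partition cluster centers into temporal groups."""
    ungrouped = set(range(len(adj)))
    groups = []
    while ungrouped:
        # 3122) seed: the ungrouped point whose row has the largest sum
        seed = max(sorted(ungrouped), key=lambda i: sum(adj[i]))
        ungrouped.discard(seed)
        group, frontier = [seed], [seed]
        while frontier and len(group) < max_group_size:
            x = frontier.pop(0)
            # 3123) the n_top most strongly connected ungrouped neighbours
            neighbours = sorted(
                (j for j in ungrouped if adj[x][j] > 0),
                key=lambda j: (-adj[x][j], j),
            )[:n_top]
            # 3124) newly added points are expanded in turn via the frontier
            for j in neighbours[: max_group_size - len(group)]:
                ungrouped.discard(j)
                group.append(j)
                frontier.append(j)
        groups.append(sorted(group))  # 3125) store the finished group
    return groups


sequence = [0, 1, 0, 1, 0, 1, 2, 3, 2, 3, 2, 3]
adj = temporal_adjacency(sequence, num_centers=4)
print(greedy_groups(adj, n_top=1, max_group_size=10))  # [[0, 1], [2, 3]]
```

In the example, centers 0 and 1 alternate often, as do 2 and 3, while the single 1-to-2 transition is too weak to merge the two groups when N_top is 1.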
32) After the learning stage of step 31) is finished, the bottom-layer nodes enter the inference stage. Each new input sample (binary sequence pattern) is processed by inference at the bottom-layer node (a child node), and the result is output to its parent node in the next layer up of the HTM network. The outputs of the child nodes that share a parent are concatenated to form the input of the parent node's learning stage, and the parent node then enters the learning stage and repeats the node learning process of step 31).
Fig. 4 shows the inference stage that an input binary sequence pattern goes through at a node. Step 32) specifically comprises the following steps:
321) Compute the probability distribution $P(e^- \mid c_i)$ of the input binary sequence pattern over the spatial patterns of the spatial pooler, and take the normalized result as the output vector y of the spatial pooler.
The spatial pattern that the input binary sequence patterns generate in the spatial pooler during the learning stage is the i-th cluster center $c_i$. At an inferring node, the probability distribution $P(e^- \mid c_i)$ of the bottom-layer input sequence $e^-$ given the i-th cluster center is a variable, and can be computed with the following formula:
$$P(e^- \mid c_i) = \gamma \prod_{k=1}^{M} \mathrm{input}(m_k^i) \qquad (1)$$
In formula (1), $\gamma$ is a proportionality constant, the i-th cluster center is expressed as (formula image GDA0000393608130000092), $m_k^i$ denotes a non-zero spatial pattern, M is the number of child nodes of this node, and $e^-$ is the input sequence from the bottom layer that is to be recognized. $\mathrm{input}(m_k^i)$ denotes the input: if this node is a bottom-layer node, $\mathrm{input}(m_k^i)$ is the node's input binary sequence pattern; if it is not, $\mathrm{input}(m_k^i)$ is the temporal pooler output probability distribution from the node's child nodes, (formula image GDA0000393608130000102) (for the computation of $P(e^- \mid g_i)$, see formula (4)).
The probability distribution for every i-th cluster center $c_i$ can be computed as $P(e^- \mid c_i)$, which is then normalized into $y(i)$. Hence $y(i)$ is proportional to $P(e^- \mid c_i)$, written $y(i) \propto P(e^- \mid c_i)$. All the $y(i)$ together form the output vector y of the node's spatial pooler, written $y = [y(1), y(2), \ldots, y(N_c)]$, where $N_c$ is the number of spatial patterns in the spatial pooler. All the $P(e^- \mid c_i)$ together form $P(e^- \mid C) = [P(e^- \mid c_1), P(e^- \mid c_2), \ldots, P(e^- \mid c_{N_c})]$, so y is proportional to $P(e^- \mid C)$, written $y \propto P(e^- \mid C)$.
322) Compute the output of the temporal pooler from the output vector y of the spatial pooler.
The temporal pooler performs inference by applying the belief propagation principle. As shown in Fig. 4, the output of the spatial pooler is the vector y, whose length is $N_c$ (the number of spatial patterns in the spatial pooler). The i-th element of the vector corresponds to the i-th cluster center $c_i = [r_i^{m_1}, \ldots, r_i^{m_M}]$. Each cluster center is thus a vector of length M, where r denotes the group index of the cluster center at each child node. The i-th element of y is computed as:
$$y(i) = \alpha_1 \prod_{j=1}^{M} \lambda^{m_j}(r_i^{m_j}) \qquad (2)$$
In formula (2), $\alpha_1$ is an arbitrary scaling constant, usually set to a fixed value to avoid numerical underflow; M is the number of child nodes; $\lambda^{m_j}$ denotes the binary sequence pattern from child node $m_j$; and $r_i^{m_j}$ denotes the group index of the i-th cluster center at child node $m_j$.
From formula (1) and the description of step 321), at node k, $y_k$ is proportional to $P(e^- \mid C_k)$, that is, $y_k \propto P(e^- \mid C_k)$, where $y_k$ and $P(e^- \mid C_k)$ are the instances of y and $P(e^- \mid C)$ at node k.
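For a node that is not in the bottom layer, the product in formula (2) can be sketched in Python as below. The data layout is an assumption of this sketch: each cluster center is stored as a list of M group indices, one per child node, and each child's message is a list indexed by group. The function names are hypothetical.

```python
def spatial_pooler_output(centers, child_messages, alpha: float = 1.0) -> list:
    """Compute y(i) as alpha times the product over children of the
    child's message evaluated at the group index stored in center i.

    centers[i][j]        : group index of cluster center i at child j
    child_messages[j][r] : value that child j reports for its group r
    """
    y = []
    for center in centers:
        if len(center) != len(child_messages):
            raise ValueError("each center needs one group index per child")
        value = alpha
        for j, group_index in enumerate(center):
            value *= child_messages[j][group_index]
        y.append(value)
    return y


def normalize(y) -> list:
    """Scale y so that its elements sum to 1 (y stays proportional)."""
    total = sum(y)
    if total == 0:
        raise ValueError("cannot normalize an all-zero vector")
    return [v / total for v in y]


# Two children, each with two temporal groups; two stored cluster centers.
centers = [[0, 1], [1, 0]]
child_messages = [[0.9, 0.1], [0.2, 0.8]]
y = spatial_pooler_output(centers, child_messages)
print(normalize(y))
```

The first center pairs group 0 of the first child with group 1 of the second, both of which the children report as likely, so it receives most of the normalized weight.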
The temporal pooler computes its output from the spatial pooler's output. The output is $\lambda$, whose length is $N_g$ (the number of temporal groups in the temporal pooler). The i-th element of $\lambda = [\lambda(1), \lambda(2), \ldots, \lambda(N_g)]$ is computed as:
$$\lambda(i) = \sum_{j=1}^{N_c} P(c_j \mid g_i)\, y(j) \qquad (3)$$
In formula (3), $P(c_j \mid g_i)$ is the conditional probability of spatial pattern $c_j$ given group $g_i$ of the temporal pooler, and $y(j)$ is the j-th element of the spatial pooler output y, with j ranging from 1 to $N_c$. Note that $y(j) \propto P(e^- \mid c_j)$, and that
$$P(e^- \mid g_i) = \sum_{j=1}^{N_c} P(c_j \mid g_i)\, P(e^- \mid c_j) \qquad (4)$$
where $P(e^- \mid g_i)$ is the probability distribution of the bottom-layer input sequence $e^-$ given temporal group $g_i$, and $P(e^- \mid G_k)$ is the vector formed by all $P(e^- \mid g_i)$ at node k.
It follows that $\lambda(i) \propto P(e^- \mid g_i)$ holds for all i. At node k, $\lambda_k$ is proportional to $P(e^- \mid G_k)$, that is, $\lambda_k \propto P(e^- \mid G_k)$. The output of the temporal pooler is the output of the node.
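Formula (3) is a matrix-vector product, which the following sketch makes explicit. The function name and the nested-list layout of the conditional probabilities are assumptions made for illustration.

```python
def temporal_pooler_output(p_c_given_g, y) -> list:
    """Compute lambda(i) as the sum over j of P(c_j | g_i) * y(j).

    p_c_given_g[i][j] : P(c_j | g_i), one row per temporal group
    y[j]              : j-th element of the spatial pooler output
    """
    for row in p_c_given_g:
        if len(row) != len(y):
            raise ValueError("each row needs one entry per spatial pattern")
    return [sum(p * v for p, v in zip(row, y)) for row in p_c_given_g]


# Four spatial patterns, two temporal groups of two patterns each.
p_c_given_g = [
    [0.5, 0.5, 0.0, 0.0],  # group g_1 contains c_1 and c_2
    [0.0, 0.0, 0.5, 0.5],  # group g_2 contains c_3 and c_4
]
y = [0.4, 0.4, 0.1, 0.1]
print(temporal_pooler_output(p_c_given_g, y))
```

The result has one element per temporal group (length N_g), while y has one element per spatial pattern (length N_c).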
33) Repeat the process of step 32) until the nodes in every layer of the HTM network have completed training on the binary sequence patterns.
As in step 32), once a layer has been trained, its nodes switch to the inference stage, and the nodes of the next layer (the parent nodes) use the outputs of the nodes in the layer below (the child nodes) as input to learn sequence patterns.
4) Extract the feature vectors of the files in the test set using byte-level n-grams.
As in step 1), the feature vectors of the files in the test set are extracted using byte-level n-grams. The test set can be chosen from malicious-code test data sets available on the Internet.
5) Feed these feature vectors into the trained HTM network for sequence recognition, to determine whether the files in the test set contain malicious code.
As in the inference of step 322), all nodes in every layer of the HTM structure are now in the inference stage. The feature vectors extracted in step 4) go through pattern inference by means of spatial-pooler sequence-pattern inference and temporal-pooler group inference. The output vector $\lambda$ of the top node is the output pattern vector of the whole HTM network, and the output probability $P(e^- \mid G_k)$ of the top node's temporal pooler is the malicious-code match rate. When this output probability is large enough, for example greater than a threshold set to 85%, the input file is considered to contain malicious code; otherwise it is considered free of malicious code.
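The final decision is a threshold test on the match rate. The sketch below assumes the match rate has already been reduced to a single number between 0 and 1; how that number is derived from the top node's output vector is not spelled out here, so it is taken as given. The function names and file names are hypothetical.

```python
def contains_malicious_code(match_rate: float, threshold: float = 0.85) -> bool:
    """Flag a file when its match rate is strictly greater than the threshold."""
    if not 0.0 <= match_rate <= 1.0:
        raise ValueError("match_rate must lie in [0, 1]")
    return match_rate > threshold


def scan(match_rates: dict, threshold: float = 0.85) -> dict:
    """Apply the threshold test to a mapping of file name to match rate."""
    return {
        name: contains_malicious_code(rate, threshold)
        for name, rate in match_rates.items()
    }


print(scan({"sample_a.bin": 0.91, "sample_b.bin": 0.40}))
# {'sample_a.bin': True, 'sample_b.bin': False}
```

Raising the threshold reduces false alarms at the cost of missing files that match the learned patterns less closely.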
The present invention uses a sample set of malicious files as the training set, trains an HTM network with the HTM pattern recognition algorithm, and then uses the HTM network to perform pattern recognition and classification inference on unknown files to determine whether they are malicious. Feature extraction uses the byte-level n-gram algorithm, which extracts a large number of file feature attributes. For pattern recognition and classification learning, the invention introduces the HTM algorithm, a new artificial-intelligence algorithm that imitates the structure and working principles of the human neocortex. It applies the Bayesian-network principles of continuous information sharing between nodes and belief propagation, converting complex problems into pattern matching and prediction. Through training it extracts the spatial sequence patterns and temporal pattern groups of the samples, and uses belief propagation to aggregate and classify the local pattern groups of each layer until it obtains the overall pattern groups. In the recognition stage, malicious-code samples are identified by matching against the sequence patterns learned at each layer. Thanks to its good noise resistance, fault tolerance, adaptability, and self-learning ability, the HTM algorithm can effectively improve the recognition rate.
The above is only a preferred embodiment of the present invention, but the scope of protection of the invention is not limited to it. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the invention shall fall within the scope of protection of the invention. The scope of protection of the invention shall therefore be determined by the scope of protection of the claims.

Claims (5)

1. A deep learning method for detecting unknown malicious code, characterized in that it comprises the following steps:
1) extract the feature vectors of the files in the training set using byte-level n-grams;
2) build a multi-layer HTM network model, and determine the input data length of each bottom-layer node of the HTM structure by slicing the file feature vector;
3) use the sliced file feature vectors as the input of the HTM network; after all bottom-layer nodes have finished learning, their inference-stage outputs are concatenated and passed upward layer by layer until the nodes in every layer of the HTM network have been trained;
4) extract the feature vectors of the files in the test set using byte-level n-grams;
5) feed these feature vectors into the trained HTM network for sequence recognition, to determine whether the files in the test set contain malicious code.
2. The deep learning method for detecting unknown malicious code according to claim 1, characterized in that step 2) specifically comprises the following steps:
21) select an HTM network model with F layers, in which every node except those in the bottom layer has M child nodes;
22) slice the file feature vector using the formula $l = L / M^{F-1}$, and use the resulting slices, in order, as the input samples of the bottom-layer nodes of the HTM network, where F is the number of layers of the HTM network structure, M is the number of child nodes of each node outside the bottom layer, L is the length of the file feature vector extracted with the byte-level n-gram method, and l is the input sample length of each bottom-layer node.
3. The deep learning method for detecting unknown malicious code according to claim 1, characterized in that step 3) specifically comprises the following steps:
31) use the sliced file feature vectors as the input of the HTM network; the bottom-layer nodes enter the learning stage, which lasts until the spatial pooler of every bottom-layer node has finished learning the spatial patterns and the temporal pooler of every bottom-layer node has finished learning the temporal groups;
32) once the learning stage of step 31) is finished, the bottom-layer nodes enter the inference stage; each new input sample is processed by inference at the bottom-layer node and the result is output to its parent node in the next layer up of the HTM network; the outputs of the lower-layer nodes that share a parent are concatenated to form the input of the learning stage of the next layer's node, which then enters the learning stage and repeats the node learning process of step 31);
33) repeat the process of step 32) until the nodes in every layer of the HTM network have completed sequence-pattern training.
4. The deep learning method for detecting unknown malicious code according to claim 3, characterized in that step 31) specifically comprises the following steps:
311) the binary sequences input to a node are fed to its spatial pooler, which learns clusters of these sequences using the maximum distance D; using the maximum distance D, the spatial pooler stores a subset of the input patterns, called cluster centers; as time passes, the number of new sequence patterns the spatial pooler produces per unit time decreases, and the clustering process stops when the number of new cluster centers per time period falls below a set threshold;
312) the sequence patterns learned by the spatial pooler are output to the temporal pooler, which groups them according to their temporal adjacency; the grouping computation ends once all sequence patterns have been grouped.
5. The deep learning detection method of unknown malicious codes according to claim 3, characterized in that step 32) specifically comprises the following steps:
321) using the formula
Figure FDA0000393608120000031
the probability distribution of the input sequence $e^{-}$ over the spatial patterns $c_i$ of the node's spatial pool is calculated and, after normalization, serves as the output of the spatial pool, wherein
Figure FDA0000393608120000032
denotes the non-zero spatial patterns, $M$ is the number of child nodes of this node, and $e^{-}$ is the input sequence to be identified, arriving from the bottom layer;
322) based on the output $y$ of the spatial pool, using the formula
Figure FDA0000393608120000033
the output of the temporal pool is calculated, wherein $N_c$ is both the length of the vector $y$ and the number of spatial patterns in the spatial pool, and $\lambda$ has length $N_g$.
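The formulas of steps 321) and 322) survive in this text only as image references, so they are not reproduced here. By way of illustration only, the following sketch uses a form found in published descriptions of HTM node inference, which is an assumption and may differ from the claimed formulas: the spatial-pool output is taken as the normalized product, over the M child nodes, of the child message entries selected by each spatial pattern, and the temporal-pool output as a weighted sum of that output over the patterns in each group. The argument `pattern_given_group` and both function names are hypothetical.

```python
from math import prod
from typing import List, Sequence


def spatial_pool_output(child_msgs: Sequence[Sequence[float]],
                        patterns: Sequence[Sequence[int]]) -> List[float]:
    """Assumed form of step 321.

    child_msgs[j] is the message from child node j (M children in total).
    patterns[i][j] is the index into child j's message selected by
    spatial pattern c_i. Returns y, of length N_c, normalized to sum to 1.
    """
    raw = [prod(msg[k] for msg, k in zip(child_msgs, c)) for c in patterns]
    total = sum(raw)
    return [v / total for v in raw] if total > 0 else raw


def temporal_pool_output(y: Sequence[float],
                         pattern_given_group: Sequence[Sequence[float]]
                         ) -> List[float]:
    """Assumed form of step 322.

    pattern_given_group has N_g rows and N_c columns; row k weights each
    spatial pattern within temporal group k. Returns lambda, length N_g.
    """
    return [sum(w * yi for w, yi in zip(row, y))
            for row in pattern_given_group]
```

Under these assumptions the lengths agree with the claim: `y` has one entry per spatial pattern ($N_c$) and the returned vector has one entry per temporal group ($N_g$).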
CN201110373558.0A 2011-11-22 2011-11-22 Deep learning detection method of unknown malicious codes Expired - Fee Related CN102411687B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201110373558.0A CN102411687B (en) 2011-11-22 2011-11-22 Deep learning detection method of unknown malicious codes

Publications (2)

Publication Number Publication Date
CN102411687A CN102411687A (en) 2012-04-11
CN102411687B true CN102411687B (en) 2014-04-23

Family

ID=45913758

Country Status (1)

Country Link
CN (1) CN102411687B (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20140423

Termination date: 20141122

EXPY Termination of patent right or utility model