CN102360408A - Detecting method and system for malicious codes - Google Patents

Detecting method and system for malicious codes

Info

Publication number
CN102360408A
Authority
CN
China
Prior art keywords
malicious code
sequence
main control
control end
characteristic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2011102901723A
Other languages
Chinese (zh)
Inventor
郑礼雄
孙波
许俊峰
严寒冰
王伟平
袁春阳
林绅文
杨鹏
向小佳
王永建
王进
张伟
郭承青
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Computing Technology of CAS
National Computer Network and Information Security Management Center
Original Assignee
Institute of Computing Technology of CAS
National Computer Network and Information Security Management Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Computing Technology of CAS, National Computer Network and Information Security Management Center filed Critical Institute of Computing Technology of CAS
Priority to CN2011102901723A priority Critical patent/CN102360408A/en
Publication of CN102360408A publication Critical patent/CN102360408A/en
Pending legal-status Critical Current

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention provides a method and a system for detecting malicious code. The approach is assisted by an OLAP engine, takes the interaction behavior between the main control end and the controlled end of malicious code as its feature, and is based on mining. The detection method comprises: step 101, probing the network; step 201, extracting suspicious main control end behavior; steps 301-303, expanding the training set; step 401, training a classifier; step 501, refreshing the Cube; steps 601-606, mining sequences; and step 701, generating rules. The detection system comprises a detection module, a training sample pool, an SVM classifier, a relational database, an OLAP engine, a feature sequence mining engine, and a knowledge base.

Description

Detection method and system for malicious code
Technical field
The present invention relates to a method and a system for detecting malicious code, and more particularly to a mining-based malicious code detection method and a system for analyzing and detecting the network behavior features of malicious code.
Background
With the spread of the Internet, information security has become increasingly important and is now a closely watched research field. Because of flaws in the design of the Internet itself and because of its openness, it is highly vulnerable to attack. Among Internet security incidents, malicious code accounts for the largest share of economic losses; at the same time, malicious code has become an important instrument of information warfare and network warfare. More seriously, the variety, propagation speed, infection count, and reach of malicious code are all growing.
For detection, software already exists that can detect many kinds of malicious code, including worms and Trojan horses. However, all of this software relies on signature-based detection: features are extracted from unknown code and compared with the signatures of known malicious code to decide whether the code is malicious. This approach has a low false-positive rate, but the signature of a piece of malicious code must be obtained before that code can be detected, which increases the chance that a computer system is infected and attacked. New malicious code is also becoming more and more complex. By using obfuscation techniques such as polymorphism and metamorphism, malicious code has become more survivable, and tools based on signature detection generally cannot recognize it. This poses a serious challenge for security work.
Summary of the invention
Therefore, the object of the invention is to propose a malicious code detection method that is assisted by an OLAP engine, takes the interaction behavior between the main control end and the controlled end of malicious code as its feature, and is based on mining, together with a corresponding prototype system.
The invention provides a method for detecting malicious code, comprising the following steps: probing the network, in which all network messages are monitored to obtain first-hand raw network data; extracting suspicious main control end behavior, in which suspicious malicious code behavior is detected with the help of the feature rules in the knowledge base and, based on information recorded in the knowledge base such as the addresses of malicious code main control ends, the behavior feature data related to those main control ends is extracted from the network messages; expanding the training set, in which the data representing suspicious main control end behavior features is stored in a database and, with expert help, data that may be behavior features of a new malicious code variant is imported into the training sample pool; training the classifier, in which the existing samples are taken as input, their features are extracted, and a feature dictionary is built and used to optimize the parameters of the SVM classifier; refreshing the Cube, in which, following the update of the training set, the corresponding fact table in the relational database is also updated, the newly imported behavior sequences are sliced, and each dimension table and the fact table are updated; mining sequences, in which the mining engine runs a new frequent sequence mining task on the new Cube to obtain new rules; and generating rules, in which the new rules are inserted into the rule tree and related structures such as indexes are updated to keep the knowledge base valid.
Extracting suspicious main control end behavior comprises the following steps: network probing is divided into global detection and directional detection; through global detection, sequences in the network interaction behavior features that fully match a feature rule in the knowledge base are found, the main control end IP address and port corresponding to each such sequence are looked up, and the IP address is checked against the set of suspicious main control ends; if the address is not in the set, the set is expanded; through directional detection, all behavior sequences of the suspicious main control ends are collected and stored in the database, with expert help where necessary, for mining; if a new frequent behavior sequence is found in the sequence mining step, the rules in the knowledge base are expanded.
The session sequence on absolute time is sliced, with boundary frequency and boundary entropy as the indicators that determine the slice boundaries; the session subsequences obtained after segmentation each have an independent relative time and are the foundation of frequent sequence mining.
The dimensions comprise a time dimension, a kind dimension, a geography dimension, and an ISP dimension.
In the refreshing step, new dimensions are added.
Sequence mining comprises a sorting stage, a large itemset stage, a transformation stage, a mining stage, and a feature selection stage.
The feature selection indicators comprise, taking the malicious code kind as the dimension, the mutual information and the mean square deviation of the probabilities with which different behavior features appear in the different kinds.
The invention also provides a malicious code detection system, comprising: a detection module, which sniffs data on the network and captures datagrams with specified features according to configurable rules; a training sample pool, which stores the initial samples, the samples having been classified in advance by experts; an SVM classifier, which, based on what it has learned from the training samples, automatically classifies the dynamically expanded data and loads the classification information into the database; a relational database, which stores the behavior sequences, extracted from the sample files, that correspond to different network connections, and which supports access by the malicious code behavior sequence slicing program; after further cleaning and transformation, the sliced behavior sequences yield new fact tables and dimension tables, which are stored in the relational database for OLAP analysis; an OLAP engine, which reads the fact tables and dimension tables in the relational database, builds the multidimensional data schema, namely the Cube, determines the statistic used as the measure (in this system, the number of times a specific behavior occurs), determines the different dimensions, and provides an MDX query interface on the Cube; a feature sequence mining engine, which finds patterns in the malicious code behavior sequence slices according to an improved Apriori algorithm, the implementation of which can access the measure values in the Cube through the MDX query interface and thereby find frequent items quickly; and a knowledge base, which stores the discovered malicious code network behavior feature sequences and organizes them as rules into a rule tree for convenient querying.
The sample pool can be expanded dynamically: the data sent by specified malicious code main control ends is tracked in real time to discover new malicious code variants.
The object of the invention, and other objects not listed here, are satisfied within the scope of the independent claims of the present application. Embodiments of the invention are defined in the independent claims, and specific features are defined in the dependent claims.
Description of drawings
The technical solution of the invention is described in detail below with reference to the accompanying drawings, in which:
Fig. 1 shows the malicious code detection method of the invention;
Fig. 2 shows the malicious code detection system of the invention;
Fig. 3 is a schematic diagram of the Trie tree structure for behavior feature sequences in the invention.
Embodiment
The features and technical effects of the technical solution of the invention are described in detail below with reference to the accompanying drawings and an illustrative embodiment. The invention proposes a malicious code detection method that is assisted by an OLAP engine, takes the interaction behavior between the main control end and the controlled end of malicious code as its feature, and is based on mining. By building a Cube with the OLAP engine, the method speeds up the processing of massive behavior feature data and provides a friendly data query and access interface. The method can also track changes in the behavior of malicious code main control ends in real time, so that the knowledge base is updated automatically and promptly and information about malicious code variants is obtained.
The overall flow for detecting and analyzing malicious code interaction behavior features is shown in Fig. 1: first, probe the network (step 101); then extract suspicious main control end behavior (step 201); next, expand the training set (steps 301-303); train the classifier (step 401); refresh the Cube (step 501); mine sequences (steps 601-606); and generate rules (step 701).
The architecture of the prototype system corresponding to the above detection method is shown in Fig. 2. It comprises a detection module, a training sample pool, an SVM classifier, a relational database, an OLAP engine, a feature sequence mining engine, and a knowledge base.
The detection method of the invention is described in detail below with reference to the drawings:
Step 101. The detection module monitors all network messages to obtain first-hand raw network data. In an implementation this requires the help of a kernel module, and a mechanism similar to tcpdump is used to capture the network messages sniffed on all probed network interfaces (IF).
Step 201. The detection module performs a preliminary analysis of the captured network messages and automatically adjusts the malicious code detection rules by tracking changes in the network behavior of main control ends, so that new malicious code variants are found promptly. Specifically, this comprises the following steps:
A. Network probing is divided into global detection and directional detection. The former analyzes all datagrams on the network; the latter analyzes suspicious main control ends specifically. In an implementation, the two detection and analysis tasks each maintain their own dedicated resources, such as process pools;
B. Through global detection, sequences in the network interaction behavior features that fully match a feature rule in the knowledge base are found, and the main control end IP address and port corresponding to each such sequence are looked up;
C. The IP address is checked against the set of suspicious main control ends; if it is not in the set, the set is expanded;
D. Through directional detection, all behavior sequences of the suspicious main control ends are collected and stored in the database, with expert help where necessary, for mining; if a new frequent behavior sequence is found in the sequence mining step, the rules in the knowledge base are expanded.
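Steps B to D can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Session` type, its field names, and the choice to treat "fully match" as exact equality between a session's behavior sequence and a rule are hypothetical and are not specified in the text.

```python
from typing import NamedTuple, Tuple

class Session(NamedTuple):
    # Hypothetical record; the field names are assumptions for illustration.
    controller_ip: str
    controller_port: int
    behaviors: Tuple[str, ...]   # abstracted interaction behaviors, in order

def global_detection(sessions, rules, suspicious_ips):
    """Steps B and C: expand the suspicious set with the IP of every
    session whose behavior sequence exactly equals a known rule.
    Returns the IP addresses that were newly added."""
    added = []
    for s in sessions:
        if s.behaviors in rules and s.controller_ip not in suspicious_ips:
            suspicious_ips.add(s.controller_ip)
            added.append(s.controller_ip)
    return added

def directional_detection(sessions, suspicious_ips):
    """Step D: collect every session of a suspicious main control end."""
    return [s for s in sessions if s.controller_ip in suspicious_ips]
```

Directional detection deliberately keeps sessions that match no rule, because those are the ones from which new frequent sequences may later be mined.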
Step 301. The training sample pool stores the network message sequences that have been analyzed and screened by the detection module. These sequences are stored in an abstracted form: each message in a sequence is stored as one record in the corresponding database table, and the features of interest to the subsequent mining analysis are extracted as columns of the table. For example, each message is represented as a tuple record: (source IP, source port, destination IP, destination port, time of occurrence, content, kind, ...).
Step 302. With expert help where necessary: for global detection, the stored results are not used as new samples; instead, new suspicious main control ends are found from them, as described in 201-B. For directional detection, the stored records are further extracted by (source IP, source port, destination IP, destination port) and sorted by time of occurrence to form preliminary session sequences, which are loaded into a new database table; these records then go on to be sliced, loaded into the Cube, and mined for rules.
Step 303. The session sequence obtained in step 302 is an event stream on absolute time. It contains the features of the malicious code's interaction behavior, but it also spans multiple sessions between the main control end and the controlled end. Slicing means segmenting the event stream to find the session boundaries, because the later analysis treats each session as an independent unit and mines the frequent patterns that occur across many sessions.
The core of the slicing algorithm is finding the boundaries. To this end, a Trie tree of depth d+1 (see Fig. 3) is used to store every d-gram of the malicious code interaction behavior feature sequence. The algorithm uses two indicators to score candidate boundaries: boundary frequency and boundary entropy. These indicators are explained below with reference to Fig. 3.
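The grouping and ordering that step 302 applies before slicing can be sketched as follows. The `Record` type and its field names are assumptions that mirror the tuple (source IP, source port, destination IP, destination port, time of occurrence, content, ...). As a simplification, both directions of a connection are not merged here.

```python
from collections import defaultdict
from typing import NamedTuple

class Record(NamedTuple):
    # Hypothetical field names mirroring the tuple described in step 301.
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    time: float
    content: str

def build_session_sequences(records):
    """Group records by (source IP, source port, destination IP,
    destination port) and order each group by time of occurrence."""
    groups = defaultdict(list)
    for r in records:
        groups[(r.src_ip, r.src_port, r.dst_ip, r.dst_port)].append(r)
    return {key: sorted(rs, key=lambda r: r.time) for key, rs in groups.items()}
```

Each resulting list is still an event stream on absolute time, so it may contain several sessions; finding the session boundaries is the job of the slicing step.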
As shown in Fig. 3, the example session sequence is ABCABC, in which each letter represents one interaction behavior; for example, A represents the main control end sending a control command to the controlled end. The sequence is divided into 2-gram subsequences, and the behaviors that occur in the subsequences, together with their numbers of occurrences, are recorded to form the Trie tree shown in the figure. Suppose that a sequence stored in step 302 is ABCABC, where each letter represents one session record in the database table, that is, (source IP, source port, destination IP, destination port, time of occurrence, content summary, kind, ...). All the 2-grams of this sequence are then: AB, BC, CA, AB, BC. Fig. 3 organizes these 2-grams in a Trie tree of depth 3. Every node except the root records two values: one is the session record that the node represents, and the other is the frequency f of the node, which is the number of times that the subsequence represented by the path from the root to this node occurs in the whole sequence. For example, the leftmost subtree contains two nodes. The upper node represents record A and has frequency 2, meaning that after the 2-gram division the subsequence A occurs 2 times in the whole sequence. The leaf node represents record B and has frequency 2, meaning that after the 2-gram division the subsequence AB occurs 2 times in the whole sequence. The node frequency, once normalized over the level of the Trie tree on which the node lies, is the boundary frequency of the node; see formula (3). Entropy is also an attribute of each node and represents the rate of change when the node acts as a boundary node; its value, once normalized over the level of the Trie tree on which the node lies, is the boundary entropy. The calculation is given in formulas (1), (2), and (4).
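The ABCABC example can be reproduced with a short Python sketch. The names `TrieNode`, `build_trie`, and `freq` are hypothetical helpers for illustration and are not taken from the patent.

```python
class TrieNode:
    def __init__(self):
        self.freq = 0        # times the path from the root to this node occurs
        self.children = {}   # next behavior -> TrieNode

def build_trie(sequence, d):
    """Insert every d-gram of `sequence` into a Trie of depth d + 1
    (the root plus d levels), counting occurrences along each path."""
    root = TrieNode()
    for start in range(len(sequence) - d + 1):
        node = root
        for symbol in sequence[start:start + d]:
            node = node.children.setdefault(symbol, TrieNode())
            node.freq += 1
    return root

def freq(root, path):
    """Frequency of the subsequence `path`, or 0 if it is not in the Trie."""
    node = root
    for symbol in path:
        node = node.children.get(symbol)
        if node is None:
            return 0
    return node.freq
```

With `root = build_trie("ABCABC", 2)`, both `freq(root, "A")` and `freq(root, "AB")` are 2, which matches the leftmost subtree described for Fig. 3, while `freq(root, "CA")` is 1.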
The slicing algorithm is as follows:
Algorithm: find the session boundaries in behavior sequence S
Input: a Trie tree T of depth d+1 and a behavior sequence S
Output: the subsequences $S_{Session}$, each representing one session, obtained by slicing S at the boundaries
for each non-leaf node x in T:
    compute the probability distribution of its child nodes $x_i$ ($1 \le i \le m$):
        $P(x_i) = f(x_i) / f(x)$    (1)
    compute the entropy of x:
        $e(x) = -\sum_{i=1}^{m} P(x_i) \log P(x_i)$    (2)
end
for each level L in T:
    compute the average frequency $\bar{f}$ and the average entropy $\bar{e}$
    compute the standard deviations $\sigma_f$ and $\sigma_e$ of f and e, respectively
    for each node x in level L:
        compute the boundary frequency of the node:
            $f(x) = \frac{f(x) - \bar{f}}{\sigma_f}$    (3)
        compute the boundary entropy of the node:
            $e(x) = \frac{e(x) - \bar{e}}{\sigma_e}$    (4)
    end
end
let the set of session subsequences be $R = \emptyset$;
let the cut points in S form the cut point set P;
store the score of each point in P in the array scores[];
for each d-gram S' in S:
    a total of d+1 positions (including both ends) can serve as cut points and form the set P' ($P' \subseteq P$);
    for each cut point p in P':
        cut S' at p into two segments $S'_{pl}$ and $S'_{pr}$
        $S'_{pl}$ and $S'_{pr}$ correspond to two nodes $x_{S'_{pl}}$ and $x_{S'_{pr}}$ on T
        $F(p) = f(x_{S'_{pl}}) + f(x_{S'_{pr}})$    (5)
        $E(p) = e(x_{S'_{pl}}) + e(x_{S'_{pr}})$    (6)
    end
    scores[i]++, where $(i \in P') \wedge (\forall p \in P' : F(i) \ge F(p))$;
    scores[j]++, where $(j \in P') \wedge (\forall p \in P' : E(j) \ge E(p))$;
end
start_index = 1;
for i = 2 to |P|:
    take the i-th cut point $P_i$ in P
    if [condition given only as an image in the original, Figure BSA00000582922200067]:
        end_index = i;
        extract from S the subsequence $S_{Session}$ bounded by $P_{start\_index}$ and $P_{end\_index}$
        $S_{Session}$ is a session subsequence and is added to the result set R;
        start_index = end_index;
    fi
end
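Formulas (1) to (4) reduce to two small computations: the entropy of a node's child distribution, and a per-level standardization of frequencies or entropies. The sketch below works on plain lists rather than a Trie. It assumes the natural logarithm and the population standard deviation, and it returns zeros when a level has no variance; none of these choices is fixed by the text.

```python
import math
from statistics import mean, pstdev

def node_entropy(child_freqs, parent_freq):
    """Formulas (1) and (2): entropy of the child distribution of a node,
    given the frequencies of its children and its own frequency."""
    probs = [f / parent_freq for f in child_freqs if f > 0]
    return -sum(p * math.log(p) for p in probs)

def standardize(values):
    """Formulas (3) and (4): subtract the level mean and divide by the
    level standard deviation. A level with zero variance maps to zeros
    (an assumption; the source does not cover this case)."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]
```

A node with a single child has entropy 0, so by this indicator it is a poor boundary candidate; a node whose children are equally likely has the highest entropy for its fan-out.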
Step 401. The session sequences obtained by slicing in step 303 are stored in the database as training samples. The SVM classifier module first uses a feature selection algorithm to build a feature dictionary and represents the data in the training set as features; it then normalizes the training set data and trains a classifier with the SVM classification algorithm. The classifier is used to quickly determine the kind represented by a session sequence, that is, to identify which kind of malicious code a sliced and stored session sequence represents, and the kind is added as an attribute to the session description table in the database.
Step 501. Building and refreshing the Cube schema requires determining the dimension values and the measure values. Dimension values describe the coordinate system of the multidimensional data cube (Cube), and a single dimension can be divided into levels. The measure is the statistic that the mining algorithm cares about; in this system it is the frequency with which an interaction behavior feature occurs under the given dimensions. The dimensions in this system specifically comprise:
A. Time dimension. The time here is logical time and describes the order of the interaction behaviors within a session;
B. Kind dimension. The kind here is the malicious code kind determined by the SVM classifier in step 401. The SVM classifier learns this division into kinds from the training samples, and the division changes accordingly as the sample pool is updated. The kind dimension is very important for feature selection in the mining algorithm of step 601: by computing the mutual information, that is, the gain, of the probabilities with which a frequent session sequence occurs in the different kinds, the session sequences that characterize malicious code can be filtered out;
C. Geography dimension. This dimension is determined by converting IP addresses to geographic locations and is hierarchical: the highest level is region, followed by country, then province or state, then city. Roll-up and drill-down operations in the Cube make it easy to obtain statistics at the different geographic levels. These values provide auxiliary statistics during mining and enable richer feature sequence selection, for example mining the typical feature sequences of malicious code within China. In the implementation, the dimension table for this dimension is obtained from the database of the national network security response coordination center and is maintained manually by the center's experts;
D. ISP dimension. This dimension is determined by converting IP addresses to the corresponding operator and provides auxiliary statistics during mining. In the implementation, the dimension table for this dimension is likewise obtained from the database of the national network security response coordination center and is maintained manually by the center's experts.
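Rolling up and drilling down over these dimensions can be illustrated with an in-memory stand-in for the Cube. This is not an OLAP engine or an MDX query; the fact rows, the dimension names, and the `count` measure are assumed for the example.

```python
from collections import Counter

def roll_up(facts, dims):
    """Aggregate the `count` measure over the chosen dimensions.
    Listing fewer dimensions rolls up; listing more drills down."""
    cube = Counter()
    for fact in facts:
        cube[tuple(fact[d] for d in dims)] += fact["count"]
    return cube

# Hypothetical fact rows: one row per (kind, location, behavior) combination.
facts = [
    {"kind": "botnet", "country": "CN", "province": "Beijing", "behavior": "A", "count": 3},
    {"kind": "botnet", "country": "CN", "province": "Hunan", "behavior": "A", "count": 2},
    {"kind": "trojan", "country": "CN", "province": "Beijing", "behavior": "A", "count": 4},
]
by_kind_country = roll_up(facts, ("kind", "country"))
by_province = roll_up(facts, ("kind", "country", "province"))
```

Here `by_kind_country` holds the overall support per kind and country, while `by_province` exposes the same measure one geographic level lower.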
Step 601. An improved Apriori-based behavior feature mining method is used to find the valid feature sequences in the network behavior. Combined with the Cube mechanism, the algorithm can quickly roll up the measure values (mainly frequencies) of the corresponding features to compute their overall support, and can quickly drill down to compute the mutual information of a session sequence's occurrence in the different classes, and select features accordingly. The mining method is divided into 5 stages: sorting, obtaining the large itemsets, transformation, mining, and feature selection.
Step 602: sorting stage. Session behavior sequences are extracted from the Cube with MDX query statements, grouped by kind and ordered by the logical time within the session.
Step 603: large itemset (LitemSet) stage. This stage follows the standard Apriori algorithm, with support defined as the number of sessions that contain the given item. Each record in a session sequence is treated as an itemset whose items are the feature fields of the record, such as the content summary. A large itemset is an itemset that is not contained in any other itemset.
Step 604: transformation stage. The LitemSet is converted to numbers: each record in a session sequence is expressed in terms of LitemSet elements and mapped to the corresponding number.
Step 605: mining stage. This stage treats the LitemSet as the set $L_1$ of candidate sequences of length 1, repeatedly extends the length of the candidate sequences, computes the support of the candidates, and finally determines the set R of valid sequences, as follows:
for (i = 2; $L_{i-1} \ne \emptyset$; i++):
    generate the temporary candidate sequence set $C_i$ by joining the sequences of $L_{i-1}$ with one another
    compute the frequency with which each sequence in $C_i$ occurs in the sessions in the database;
    delete from $C_i$ the sequences whose support does not reach the threshold, obtaining the candidate sequence set $L_i$
end
From $\bigcup_i L_i$, select the maximal sequences, that is, the sequences not contained in any other sequence, and denote the result R;
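The loop above can be sketched in Python once sessions have been mapped to integers by the transformation stage. Two points are assumptions rather than statements from the text: a pattern counts as contained in a session if its items appear in order with gaps allowed, and two sequences are joined when one without its first item equals the other without its last item.

```python
def is_subsequence(pattern, session):
    """True if the items of `pattern` occur in `session` in order (gaps allowed)."""
    it = iter(session)
    return all(item in it for item in pattern)

def support(pattern, sessions):
    """Number of sessions that contain `pattern`."""
    return sum(is_subsequence(pattern, s) for s in sessions)

def mine_sequences(sessions, min_support):
    """Level-wise mining: extend frequent sequences by one item per round,
    prune by support, and return only the maximal frequent sequences."""
    items = sorted({x for s in sessions for x in s})
    level = [(x,) for x in items if support((x,), sessions) >= min_support]
    frequent = list(level)
    while level:
        # Join step: a and b combine when a minus its head equals b minus its tail.
        candidates = {a + (b[-1],) for a in level for b in level if a[1:] == b[:-1]}
        level = sorted(c for c in candidates if support(c, sessions) >= min_support)
        frequent.extend(level)
    return [p for p in frequent
            if not any(p != q and is_subsequence(p, q) for q in frequent)]
```

In this sketch support is recounted from the raw sessions on every round; in the described system those counts would instead come from the measure values in the Cube.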
Step 606: feature selection stage. Because the rules in the knowledge base guide the capture of suspicious datagrams, and in order to reduce the false-positive rate, the feature sequences whose support exceeds the threshold in the preceding steps also undergo feature selection based on mutual information. The specific steps are as follows:
Based on the distribution of the different behavior features along the kind dimension, their mutual information is computed, and the mean square deviation of the distribution function is used as a weighting parameter to further assess the validity of each feature, so that the feature sequences that characterize malicious code behavior are selected. The mean-square-deviation-weighted mutual information formula is:
$IW(i) = \sigma_i \sum_{v_i=0}^{1} \sum_{C_j} \log \frac{P(v_i, C_j)}{P(v_i)\, P(C_j)}$    (7)
In the above formula, $C_j$ denotes the j-th class of malicious code distinguished by the kind dimension; $v_i$ denotes the association between the i-th behavior feature and $C_j$: $v_i = 0$ means that the i-th behavior feature has not appeared in malicious code of kind $C_j$, and $v_i = 1$ means that it has appeared in malicious code of kind $C_j$; $P(v_i, C_j)$ is the proportion, in class $C_j$, for which the value of the i-th behavior feature is $v_i$; $P(v_i)$ is the proportion for which the value of the i-th behavior feature is $v_i$; $P(C_j)$ is the proportion accounted for by the j-th class of malicious code; $\sigma_i$ is the mean square deviation of the distribution of the i-th behavior feature across the classes of malicious code. This value characterizes how balanced the distribution of the i-th behavior feature is: the larger the value, the more unbalanced the distribution and the better the feature distinguishes different kinds of malicious code.
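Formula (7) can be evaluated from per-class counts, as in the sketch below. Several choices are assumptions, because the text leaves them open: $P(v_i, C_j)$ is treated as a joint proportion over all samples, $\sigma_i$ is taken as the population standard deviation of the per-class occurrence rates, the logarithm is natural, and terms whose joint proportion is zero are skipped because their logarithm is undefined.

```python
import math
from statistics import pstdev

def weighted_mutual_information(present, total):
    """Formula (7) for one behavior feature.
    present[j]: samples of class j in which the feature appears (assumed input)
    total[j]:   all samples of class j (assumed input)"""
    n = sum(total)
    p_one = sum(present) / n
    p_v = {1: p_one, 0: 1 - p_one}
    # sigma_i: spread of the feature's occurrence rate across the classes.
    sigma = pstdev([p / t for p, t in zip(present, total)])
    acc = 0.0
    for p, t in zip(present, total):
        p_c = t / n
        for v, joint_count in ((1, p), (0, t - p)):
            joint = joint_count / n
            if joint > 0:   # skip undefined log(0) terms (assumption)
                acc += math.log(joint / (p_v[v] * p_c))
    return sigma * acc
```

A feature that occurs at the same rate in every class scores 0, because $\sigma_i$ is 0 and every log term vanishes; a feature that occurs in only one of two equally sized classes scores $\log 2$ under these assumptions.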
Step 701. The frequent session sequences selected in the preceding feature selection step are taken as new rules and inserted into the rule tree, and related structures such as indexes are updated to keep the knowledge base valid.
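One plausible shape for the rule tree is a prefix tree keyed by behavior, sketched below. The text only states that rules are organized into a rule tree, so the nested-dictionary layout, the `None` end-of-rule marker, and both function names are assumptions for illustration.

```python
def insert_rule(tree, rule):
    """Insert one behavior sequence into a nested-dict rule tree."""
    node = tree
    for behavior in rule:
        node = node.setdefault(behavior, {})
    node[None] = True   # marks that a complete rule ends at this node

def matches_rule(tree, sequence):
    """True if `sequence` is exactly one of the stored rules."""
    node = tree
    for behavior in sequence:
        if behavior not in node:
            return False
        node = node[behavior]
    return None in node
```

Rules that share a prefix share nodes, so inserting a new variant of a known sequence only adds the nodes where the two diverge.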
According to the detection method and system of the invention, using the OLAP engine and building the Cube speeds up the processing of massive behavior feature data and provides a friendly data query and access interface. The method can also track changes in the behavior of malicious code main control ends in real time, so that the knowledge base is updated automatically and promptly and information about malicious code variants is obtained.
Although the invention has been described with reference to one or more exemplary embodiments, those skilled in the art will recognize that various suitable changes and equivalents can be made to the malicious code detection method and its architecture without departing from the scope of the invention. In addition, many modifications suited to a particular situation or step can be made from the disclosed teaching without departing from the scope of the invention. Therefore, the invention is not intended to be limited to the specific embodiments disclosed as the best mode for carrying it out; the disclosed device structure and its manufacturing method will include all embodiments that fall within the scope of the invention.

Claims (9)

1. the detection method of a malicious code may further comprise the steps:
Detection network is monitored all network messages, obtains first-hand primitive network data;
Extract suspicious main control end behavior; In knowledge base under the help of characteristic rule; Detect suspicious malicious code behavior,, from network message, extract the behavioural characteristic data relevant with these main control end based on the information that comprises malicious code main control end address information that writes down in the knowledge base;
Expanding training set, under expert's help, in database, possibly be in the data importing training sample pond of malicious code new variant behavioural characteristic with the data storage of representing suspicious main control end behavioural characteristic;
Training classifier is input with existing sample, extracts its characteristic, sets up feature lexicon, is used for optimizing svm classifier device parameter;
Refresh Cube,, also upgrade corresponding fact table in the relational database, the behavior sequence of new importing is carried out burst, upgrade each dimension table and fact table to the renewal of training set;
Sequence is excavated, and excavates engine and on new Cube, carries out new frequent sequence mining task, obtains new rule;
Create-rule inserts new regulation in the rule tree, upgrades the relevant structure that comprises index, with the validity in maintenance knowledge storehouse.
2. the detection method of malicious code as claimed in claim 1, said extraction main control end suspicious actions may further comprise the steps: network is divided into the overall situation surveys and directional detection; Survey through the overall situation; Find in the network interaction behavioural characteristic and knowledge base characteristic rule matched sequence fully, inquire about this sequence corresponding main control end IP address and port, inquire about this IP address whether in suspicious main control end set; If the result then expands suspicious main control end set for not; Through directional detection, all behavior sequences of suspicious main control end necessity the expert is helped to gather warehouse-in down, in order to excavating, if in the sequence excavation step the new frequent behavior sequence of discovery, the rule in the storehouse that then expands knowledge.
3. the detection method of malicious code as claimed in claim 1 wherein, is cut into slices the session sequence on the absolute time, is that index is confirmed section boundaries with edge frequency and border entropy, and the session subsequence that obtains after the segmentation has independently relative time.
4. the detection method of malicious code as claimed in claim 1, wherein, said dimension comprises time dimension, kind dimension, geography dimensionality, ISP dimension.
5. The malicious code detection method as claimed in claim 1, wherein a new dimension is added in said refresh step.
6. The malicious code detection method as claimed in claim 1, wherein said sequence mining comprises a sorting phase, a large itemset phase, a transformation phase, a mining phase, and a feature selection phase.
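The first four phases of claim 6 can be sketched as a simplified level-wise miner in the spirit of Apriori-style sequential pattern mining. This is not the patent's improved algorithm: it assumes each event carries a single behavior symbol, counts support as the number of sessions containing a pattern as an ordered subsequence (gaps allowed), and leaves out the feature selection phase. All names are hypothetical.

```python
from typing import Dict, Iterable, List, Sequence, Tuple

Event = Tuple[str, float, str]  # (session id, timestamp, behavior symbol)


def is_subsequence(pattern: Sequence[str], seq: Sequence[str]) -> bool:
    """True if pattern occurs in seq in order, with gaps allowed."""
    it = iter(seq)
    return all(symbol in it for symbol in pattern)


def support(pattern: Sequence[str], sequences: List[List[str]]) -> int:
    """Number of sessions that contain the pattern."""
    return sum(is_subsequence(pattern, s) for s in sequences)


def mine_frequent_sequences(
    events: Iterable[Event], min_support: int
) -> Dict[Tuple[str, ...], int]:
    if min_support < 1:
        raise ValueError("min_support must be at least 1")

    # Sorting phase: order events by session and time, one sequence per session.
    sessions: Dict[str, List[str]] = {}
    for sid, _ts, symbol in sorted(events, key=lambda e: (e[0], e[1])):
        sessions.setdefault(sid, []).append(symbol)
    sequences = list(sessions.values())

    # Large itemset phase: with single-symbol events, these are frequent symbols.
    symbols = sorted({s for seq in sequences for s in seq})
    frequent = [s for s in symbols if support((s,), sequences) >= min_support]

    # Transformation phase: drop symbols that can never appear in a frequent pattern.
    keep = set(frequent)
    sequences = [[s for s in seq if s in keep] for seq in sequences]

    # Mining phase: grow patterns one symbol at a time, keeping only frequent ones.
    result: Dict[Tuple[str, ...], int] = {}
    level = [(s,) for s in frequent]
    while level:
        next_level = []
        for pattern in level:
            count = support(pattern, sequences)
            if count >= min_support:
                result[pattern] = count
                next_level.extend(pattern + (s,) for s in frequent)
        level = next_level
    return result
```

Extending only patterns that are already frequent is what keeps the search finite: any pattern longer than the longest session has zero support, so the loop stops once no candidate at a level survives.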
7. The malicious code detection method as claimed in claim 1, wherein the feature selection indices comprise, taking the malicious code kind as the dimension, the mutual information and the mean square deviation of the probabilities with which different behavior features appear in the different kinds.
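One possible reading of the two indices in claim 7 is sketched below. It assumes that "mutual information" is taken between the presence of a behavior feature and the malicious code kind, and that "mean square deviation" means the population standard deviation of the per-kind occurrence probabilities; the claim does not fix either definition, and the function name and input layout are hypothetical.

```python
import math
from statistics import pstdev
from typing import Dict, Tuple


def feature_scores(counts: Dict[str, Tuple[int, int]]) -> Tuple[float, float]:
    """Score one behavior feature across malicious code kinds.

    counts[kind] = (samples of that kind showing the feature, samples of that kind).
    Every kind is assumed to have at least one sample.
    Returns (mutual information in bits, standard deviation of per-kind probabilities).
    """
    total = sum(n for _, n in counts.values())
    with_feature = sum(hit for hit, _ in counts.values())

    mi = 0.0
    for hit, n in counts.values():
        # Two cells per kind: feature present, feature absent.
        for joint, marginal in ((hit, with_feature), (n - hit, total - with_feature)):
            if joint > 0:
                p_joint = joint / total
                p_feature = marginal / total
                p_kind = n / total
                mi += p_joint * math.log2(p_joint / (p_feature * p_kind))

    spread = pstdev([hit / n for hit, n in counts.values()])
    return mi, spread
```

Under this reading, a feature seen in every sample of one kind and in none of another scores high on both indices, while a feature equally likely in all kinds scores zero on both.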
8. A malicious code detection system, comprising:
A detection module, used for sniffing network data and capturing datagrams with specified characteristics according to configurable rules;
A training sample pool, used for storing the initial samples, said samples having been classified in advance by experts;
An SVM classifier, which, based on what it has learned from the training samples, can automatically classify the dynamically expanded data, and which is responsible for loading the classification information into the database;
A relational database, used for storing the behavior sequences, extracted from the sample files, that correspond to different network connections; said relational database supports access by the malicious code behavior sequence slicing program; the sliced behavior sequences, after further cleaning and transformation, generate new fact tables and dimension tables, which are stored in the relational database for use in OLAP analysis;
An OLAP engine, used for reading the fact tables and dimension tables of the relational database, establishing the multidimensional data schema, determining the statistic used as the measure (set in this system to the number of times a specific behavior occurs), determining the different dimensions, and providing an MDX query interface on the Cube;
A characteristic sequence mining engine, which finds patterns in the malicious code behavior sequence slices according to an improved Apriori algorithm; the program implementing this algorithm can access the measure values in the Cube through the MDX query interface in order to find frequent items quickly;
A knowledge base, used for storing the discovered malicious code network behavior characteristic sequences, which are organized as rules into a rule tree for ease of querying.
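The detection module's "configurable rules" are not detailed in the claim. A minimal sketch, assuming a rule can constrain the destination port and a payload prefix and that a datagram is captured when any rule matches, could look like this; the `Datagram` and `CaptureRule` types and their fields are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Datagram:
    src_ip: str
    dst_ip: str
    dst_port: int
    payload: bytes


@dataclass(frozen=True)
class CaptureRule:
    """A configurable capture rule; a field left as None matches anything."""
    dst_port: Optional[int] = None
    payload_prefix: Optional[bytes] = None

    def matches(self, datagram: Datagram) -> bool:
        if self.dst_port is not None and datagram.dst_port != self.dst_port:
            return False
        if self.payload_prefix is not None and not datagram.payload.startswith(
            self.payload_prefix
        ):
            return False
        return True


def capture(datagrams: Iterable[Datagram], rules: Iterable[CaptureRule]) -> List[Datagram]:
    """Keep the datagrams that satisfy at least one configured rule."""
    rule_list = list(rules)
    return [d for d in datagrams if any(r.matches(d) for r in rule_list)]
```

Keeping rules as plain data rather than code means the capture criteria can be changed from configuration without modifying the sniffer itself.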
9. The malicious code detection system as claimed in claim 8, wherein said sample pool can be dynamically expanded, and the data sent by a specified malicious code main control end is tracked in real time in order to find new malicious code variants.
CN2011102901723A 2011-09-28 2011-09-28 Detecting method and system for malicious codes Pending CN102360408A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2011102901723A CN102360408A (en) 2011-09-28 2011-09-28 Detecting method and system for malicious codes

Publications (1)

Publication Number Publication Date
CN102360408A true CN102360408A (en) 2012-02-22

Family

ID=45585736

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2011102901723A Pending CN102360408A (en) 2011-09-28 2011-09-28 Detecting method and system for malicious codes

Country Status (1)

Country Link
CN (1) CN102360408A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040153365A1 (en) * 2004-03-16 2004-08-05 Emergency 24, Inc. Method for detecting fraudulent internet traffic
CN101449284A (en) * 2006-03-20 2009-06-03 乔耳·贝尔曼 Scoring quality of traffic to network sites using interrelated traffic parameters

Cited By (37)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103368904B (en) * 2012-03-27 2016-12-28 百度在线网络技术(北京)有限公司 The detection of mobile terminal, questionable conduct and decision-making system and method
CN103368904A (en) * 2012-03-27 2013-10-23 百度在线网络技术(北京)有限公司 Mobile terminal, and system and method for suspicious behavior detection and judgment
CN102651088B (en) * 2012-04-09 2014-03-26 南京邮电大学 Classification method for malicious code based on A_Kohonen neural network
CN102651088A (en) * 2012-04-09 2012-08-29 南京邮电大学 Classification method for malicious code based on A_Kohonen neural network
CN104077524A (en) * 2013-03-25 2014-10-01 腾讯科技(深圳)有限公司 Training method used for virus identification and virus identification method and device
CN104077524B (en) * 2013-03-25 2018-01-09 腾讯科技(深圳)有限公司 Training method and viruses indentification method and device for viruses indentification
CN104144142A (en) * 2013-05-07 2014-11-12 阿里巴巴集团控股有限公司 Web vulnerability discovery method and system
CN103810427A (en) * 2014-02-20 2014-05-21 中国科学院信息工程研究所 Mining method and system for malicious code hiding behaviors
CN103810427B (en) * 2014-02-20 2016-09-21 中国科学院信息工程研究所 A kind of malicious code hidden behaviour method for digging and system
CN106663167B (en) * 2014-07-16 2020-03-27 微软技术许可有限责任公司 Identifying behavioral changes of an online service
CN106663167A (en) * 2014-07-16 2017-05-10 微软技术许可有限责任公司 Recognition of behavioural changes of online services
CN107251513A (en) * 2014-11-25 2017-10-13 恩西洛有限公司 System and method for the accurate guarantee of Malicious Code Detection
CN107251513B (en) * 2014-11-25 2020-06-09 恩西洛有限公司 System and method for accurate assurance of malicious code detection
CN104702605A (en) * 2015-03-11 2015-06-10 国家计算机网络与信息安全管理中心 Malicious code identification method and device applied to businesses between internal and external networks
CN105306475B (en) * 2015-11-05 2018-06-29 天津理工大学 A kind of network inbreak detection method based on Classification of Association Rules
CN105306475A (en) * 2015-11-05 2016-02-03 天津理工大学 Network intrusion detection method based on association rule classification
CN106203117A (en) * 2016-07-12 2016-12-07 国家计算机网络与信息安全管理中心 A kind of malice mobile applications decision method based on machine learning
CN106649476A (en) * 2016-09-29 2017-05-10 北京中联网盟科技股份有限公司 IP address information query system
CN106649476B (en) * 2016-09-29 2019-08-20 北京中联网盟科技有限公司 A kind of IP address information inquiry system
CN106709349A (en) * 2016-12-15 2017-05-24 中国人民解放军国防科学技术大学 Multi-dimension behavior characteristic-based malicious code classification method
CN106682515A (en) * 2016-12-15 2017-05-17 中国人民解放军国防科学技术大学 Method for measuring behavior competence during malicious code analysis
CN106682515B (en) * 2016-12-15 2019-10-18 中国人民解放军国防科学技术大学 The measure of capacity in malicious code analysis
CN106709349B (en) * 2016-12-15 2019-10-29 中国人民解放军国防科学技术大学 A kind of malicious code classification method based on various dimensions behavioural characteristic
CN106600067A (en) * 2016-12-19 2017-04-26 广州视源电子科技股份有限公司 Method and device for optimizing multidimensional cube model
CN106600067B (en) * 2016-12-19 2020-11-03 广州视源电子科技股份有限公司 Method and device for optimizing multidimensional cube model
CN107437027A (en) * 2017-07-28 2017-12-05 四川长虹电器股份有限公司 Malicious code quick search and the System and method for of detection
CN109784053A (en) * 2018-12-29 2019-05-21 360企业安全技术(珠海)有限公司 Generation method, device and storage medium, the electronic device of filtering rule
CN109698835A (en) * 2019-01-19 2019-04-30 郑州轻工业学院 A kind of encryption Trojan detecting method towards the hidden tunnel HTTPS
CN109698835B (en) * 2019-01-19 2021-03-26 郑州轻工业学院 Encrypted Trojan horse detection method facing HTTPS hidden tunnel
CN110008701A (en) * 2019-03-20 2019-07-12 北京大学 Static detection Rules extraction method and detection method based on ELF file characteristic
CN110008701B (en) * 2019-03-20 2020-11-03 北京大学 Static detection rule extraction method and detection method based on ELF file characteristics
CN110781662A (en) * 2019-10-21 2020-02-11 腾讯科技(深圳)有限公司 Method for determining point-to-point mutual information and related equipment
CN110955895A (en) * 2019-11-29 2020-04-03 珠海豹趣科技有限公司 Operation interception method and device and computer readable storage medium
CN110955895B (en) * 2019-11-29 2022-03-29 珠海豹趣科技有限公司 Operation interception method and device and computer readable storage medium
CN112257062A (en) * 2020-12-23 2021-01-22 北京金睛云华科技有限公司 Sandbox knowledge base generation method and device based on frequent item set mining
CN112257062B (en) * 2020-12-23 2021-04-16 北京金睛云华科技有限公司 Sandbox knowledge base generation method and device based on frequent item set mining
CN114065199A (en) * 2021-11-18 2022-02-18 山东省计算中心(国家超级计算济南中心) Cross-platform malicious code detection method and system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20120222