Detailed Description of the Embodiments
To make the objectives, technical solutions, and advantages of the present invention clearer, the technical solutions of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are some, rather than all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
The present invention applies entropy theory to DNS anomaly detection for the first time, so entropy is briefly introduced first. In information theory, entropy is defined as follows. Suppose a system S contains an event set $E = \{E_1, E_2, \ldots, E_n\}$, where $E_1, E_2, \ldots, E_n$ are the individual events of E, with probability distribution $P = \{P_1, P_2, \ldots, P_n\}$, where $P_1, P_2, \ldots, P_n$ are the probabilities with which the respective events occur. The information amount $I_r$ of each event r can be calculated by formula (1):
$$I_r = -\log_2 P_r \tag{1}$$
In formula (1), r = 1, 2, ..., n.
For example, English has 26 letters. If every letter occurs equally often in a text, the information amount of each letter is $I = -\log_2(1/26) \approx 4.7$ bits.
Chinese has 2500 commonly used characters. If every character occurs equally often in a text, the information amount of each character is $I = -\log_2(1/2500) \approx 11.3$ bits.
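The two figures above can be checked numerically. The following is a minimal sketch of formula (1); the function name `information_content` is hypothetical and chosen only for illustration.

```python
import math


def information_content(probability: float) -> float:
    """Information amount of one event in bits: I = -log2(P), as in formula (1)."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(probability)


# 26 equally likely English letters and 2500 equally likely Chinese characters.
letter_bits = information_content(1 / 26)
character_bits = information_content(1 / 2500)
print(round(letter_bits, 1), round(character_bits, 1))  # prints: 4.7 11.3
```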
Entropy is the average information amount of the whole system S. Denoting the entropy by $H_s$, it is calculated as shown in formula (2):
$$H_s = \sum_{r=1}^{n} P_r I_r = -\sum_{r=1}^{n} P_r \log_2 P_r \tag{2}$$
In the field of information and communication, entropy represents the uncertainty of information. A system with a high degree of information has lower entropy, which indicates that the system is relatively stable; a system with a low degree of information has higher entropy, which indicates that the system is unstable and prone to anomalies. Entropy can therefore be used to detect whether a DNS anomaly has occurred.
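As an illustration of how entropy separates a concentrated distribution from a spread-out one, the sketch below computes the Shannon entropy of two made-up probability distributions. The function name and the sample distributions are assumptions for illustration and do not come from the described method.

```python
import math
from typing import Sequence


def shannon_entropy(probabilities: Sequence[float]) -> float:
    """H = -sum(P_r * log2(P_r)) in bits; zero-probability events contribute nothing."""
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# Hypothetical distributions over four events.
uniform = shannon_entropy([0.25, 0.25, 0.25, 0.25])  # events evenly spread
skewed = shannon_entropy([0.97, 0.01, 0.01, 0.01])   # one event dominates
print(uniform, skewed < uniform)
```

The evenly spread distribution reaches the maximum of 2 bits for four events, while the distribution dominated by one event stays well below it.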
Embodiment 1
Fig. 1 is a schematic flowchart of an embodiment of the method for detecting DNS anomalies according to the present invention. As shown in Fig. 1, the method includes:
Step 101: Divide the DNS query data stream into a plurality of data blocks.
It should be noted that the larger the data blocks, that is, the more queries each data block contains, the more gently the entropy of the data block varies. This effectively reduces false detections, but it also reduces sensitivity to abnormal traffic, so the miss rate rises. Conversely, the smaller the data blocks, that is, the fewer queries each data block contains, the more sensitive the detection of DNS anomalies becomes, but the accuracy decreases accordingly.
In practice, the DNS query data stream may be divided into a plurality of data blocks by a specified time and/or by a specified query volume. For example, the queries of each minute in the DNS query data stream may form one data block, or every 1000 query records may form one data block. The two criteria may also be combined: for example, a data block is closed when the specified time is reached even though the specified query volume has not been reached, or when the specified query volume is reached even though the specified time has not been reached. The division may also follow a function of the time of day. For example, between 8:30 and 12:00 in the morning, data blocks may be divided using a shorter period, such as one data block every 20 to 30 seconds; between 12:00 noon and 1:00 in the afternoon, a longer period may be used, such as one data block every 2 to 3 minutes. The division may be adjusted by a technician according to actual conditions, or chosen based on experience and the query volume.
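The two basic division criteria can be sketched as follows. This is an illustrative sketch only: the record layout `(timestamp_in_seconds, query_type)`, the function names, and the sample data are assumptions, and the time-based variant assumes the records are already sorted by timestamp.

```python
from itertools import groupby
from typing import List, Sequence, Tuple

# Hypothetical record layout: (timestamp in seconds, query type).
Record = Tuple[float, str]


def blocks_by_count(records: Sequence[Record], size: int) -> List[List[Record]]:
    """Divide the stream into consecutive blocks of `size` records (the last may be shorter)."""
    return [list(records[i:i + size]) for i in range(0, len(records), size)]


def blocks_by_time(records: Sequence[Record], seconds: float) -> List[List[Record]]:
    """Divide the stream into blocks covering `seconds` each; empty windows are skipped."""
    if not records:
        return []
    start = records[0][0]
    window_index = lambda record: int((record[0] - start) // seconds)
    return [list(group) for _, group in groupby(records, key=window_index)]


sample = [(0, "A"), (10, "A"), (59, "MX"), (60, "A"), (125, "PTR")]
print([len(b) for b in blocks_by_count(sample, 2)])   # prints: [2, 2, 1]
print([len(b) for b in blocks_by_time(sample, 60)])   # prints: [3, 1, 1]
```

A combined criterion (close the block on whichever limit is hit first) or a time-of-day dependent period could be layered on the same structure.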
Step 102: Calculate the entropy of the plurality of data blocks according to a preset query attribute to obtain a corresponding plurality of entropy values.
The preset query attribute includes the query type, the error type occurring in queries, the query source IP occurring in queries, or the occurrence of queried domain names, but is not limited to these; any query attribute that can be divided into categories may be used.
The query types include at least: the IP address record corresponding to a domain name (Address, A), the IPv6 host address record (AAAA), the reverse record (Pointer, PTR), the mail exchanger record (Mail Exchanger, MX), the name server record (Name Server, NS), and the start of authority record (Start of Authority, SOA).
The error type occurring in a query means that the DNS query request that was sent contains an illegal field. The main error types include: the query source address is a private address, the query type does not exist, the top-level domain of the queried name does not exist, the queried name contains illegal characters, the queried name is malformed, repeated queries, and the normal query class. A normal query is a query without errors. When the preset query attribute is the error type, error-free queries are placed in the normal query class so that every query record can be assigned to a specific type.
Calculating the entropy of the plurality of data blocks according to the preset query attribute specifically includes:
calculating the probability that each element of the preset query attribute occurs in each data block; and
calculating the entropy of each data block from the probability that each element of the preset query attribute occurs in that data block.
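The two sub-steps above can be sketched as follows, taking the query type as the preset query attribute. This is a minimal sketch under the assumption that the sum runs over the distinct elements observed in a block and that probabilities are relative frequencies; the function name and sample blocks are hypothetical.

```python
import math
from collections import Counter
from typing import Hashable, Iterable


def block_entropy(attribute_values: Iterable[Hashable]) -> float:
    """Entropy (bits) of one data block, given the attribute value of each query record."""
    counts = Counter(attribute_values)
    total = sum(counts.values())
    # Sub-step 1: probability of each element = its relative frequency in the block.
    probabilities = [count / total for count in counts.values()]
    # Sub-step 2: entropy from those probabilities.
    return -sum(p * math.log2(p) for p in probabilities)


mixed_block = ["A", "A", "AAAA", "MX"]   # hypothetical query types in one block
uniform_block = ["A", "A", "A", "A"]     # every record has the same query type
print(block_entropy(mixed_block), block_entropy(uniform_block) == 0)  # prints: 1.5 True
```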
The data blocks obtained by the division may overlap one another. For example, Fig. 2 is a schematic diagram of dividing data blocks by a specified time. As shown in Fig. 2, the queries between 8:00 and 8:10 form one data block, and the queries between 8:03 and 8:13 form another data block. That is, each data block covers 10 minutes and consecutive data blocks are offset by 3 minutes, so the DNS query data stream is divided into a plurality of overlapping data blocks. This embodiment is described in detail below using the example in which each data block contains a specified query volume.
Suppose the specified query volume of each data block is 10 query records, the current data block is the i-th data block, the previous data block adjacent to the current data block is the (i-1)-th data block, and the next data block adjacent to the current data block is the (i+1)-th data block. If the (i-1)-th data block contains the 1st to 10th query records, then the i-th data block contains the 2nd to 11th query records, and the (i+1)-th data block contains the 3rd to 12th query records. The overlap between the (i-1)-th and i-th data blocks is the 2nd to 10th query records, and the overlap between the i-th and (i+1)-th data blocks is the 3rd to 11th query records.
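The overlapping division in this example can be reproduced with a simple sliding window. The sketch below is illustrative only; the function name and the `step` parameter are assumptions, and query records are represented by their sequence numbers 1 to 12.

```python
from typing import List, Sequence


def overlapping_blocks(records: Sequence[int], size: int, step: int = 1) -> List[List[int]]:
    """Sliding-window division: each block holds `size` records and starts `step` records later."""
    return [list(records[start:start + size])
            for start in range(0, len(records) - size + 1, step)]


records = list(range(1, 13))                 # query records numbered 1..12
blocks = overlapping_blocks(records, size=10)
# blocks[0] is block i-1 (records 1..10), blocks[1] is block i (2..11),
# blocks[2] is block i+1 (3..12).
overlap = sorted(set(blocks[0]) & set(blocks[1]))
print(blocks[1][0], blocks[1][-1], overlap[0], overlap[-1])  # prints: 2 11 2 10
```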
When the data blocks obtained by the division overlap, calculating the entropy of the plurality of data blocks according to the preset query attribute may include:
calculating the entropy $H_1$ of the previous data block adjacent to the current data block; and
calculating the entropy $H_2$ of the current data block according to the entropy $H_1$ of the previous data block adjacent to the current data block.
Calculating the entropy $H_2$ of the current data block according to the entropy $H_1$ of the previous data block is specifically as follows:
Calculate the weighted information amounts $T_f$ and $T_l$ of the first specified query volume and the second specified query volume, respectively, in the (i-1)-th data block. The first specified query volume is the non-overlapping portion that precedes the overlap between the i-th and (i-1)-th data blocks; the second specified query volume is the non-overlapping portion that follows the overlap between the i-th and (i+1)-th data blocks.
Continuing the example above, the first specified query volume is the 1st query record, and the second specified query volume is the 12th query record.
Let $P_f$ be the probability that the query type of the 1st query record occurs in the (i-1)-th data block; then $T_f = -P_f \log_2 P_f$.
Let $P_l$ be the probability that the query type of the 12th query record occurs in the (i-1)-th data block; then $T_l = -P_l \log_2 P_l$.
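Calculate the weighted information amounts of the second specified query volume and the third specified query volume, respectively, in the i-th data block [symbols missing from the source text]. The third specified query volume is the non-overlapping portion that precedes the overlap between the i-th and (i+1)-th data blocks.
Continuing the example above, the weighted information amount of the 12th query record is obtained from the probability that its query type occurs in the i-th data block [probability symbol and formula missing from the source text].
The third specified query volume is the 2nd query record, and its weighted information amount is obtained from the probability that the query type of the 2nd query record occurs in the i-th data block [probability symbol and formula missing from the source text].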
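The entropy $H_2$ of the i-th data block is calculated from the entropy $H_1$ of the (i-1)-th data block, $T_f$, $T_l$, and the two weighted information amounts calculated in the i-th data block [formula missing from the source text].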
When i is 2, that is, when the previous data block adjacent to the current data block is the first data block obtained by the division, the probability that each element of the preset query attribute occurs in the first data block is calculated, and the entropy $H_1$ of the first data block is calculated from these probabilities.
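The idea of deriving the entropy of the current block from that of the previous block, with the first block computed directly, can be illustrated with a standard sliding-window technique. The sketch below is not the update formula of this embodiment; it is a swapped-in, generic method that keeps per-element counts and uses the identity H = log2(n) - (1/n) * sum(c * log2(c)), where c are the element counts in a block of n records. All names are hypothetical, and the window is assumed to advance by one record at a time.

```python
import math
from collections import Counter
from typing import Hashable, List, Sequence


def _c_log_c(count: int) -> float:
    """c * log2(c), with the convention that 0 * log2(0) = 0."""
    return count * math.log2(count) if count > 0 else 0.0


def sliding_entropies(values: Sequence[Hashable], size: int) -> List[float]:
    """Entropy of every window of `size` records, each derived from the previous window."""
    if size <= 0 or len(values) < size:
        return []
    counts = Counter(values[:size])                   # first block: computed directly
    s = sum(_c_log_c(c) for c in counts.values())     # running sum of c * log2(c)
    entropies = [math.log2(size) - s / size]
    for leaving, entering in zip(values, values[size:]):
        if leaving != entering:
            # Only the counts of the record that leaves and the one that enters change.
            for value, delta in ((leaving, -1), (entering, +1)):
                s -= _c_log_c(counts[value])
                counts[value] += delta
                s += _c_log_c(counts[value])
        entropies.append(math.log2(size) - s / size)
    return entropies


query_types = ["A", "A", "MX", "A", "PTR", "A", "A", "NS"]  # hypothetical stream
print(len(sliding_entropies(query_types, 4)))  # prints: 5
```

Each slide costs a constant amount of work regardless of the block size, which is the practical benefit of reusing the previous block's result.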
For example, if the preset query attribute is the query type, the elements of the query type are the specific query types, such as A, AAAA, PTR, MX, NS, and SOA described above, and each query record can belong to only one query type. The probability that the query type of each query record in a data block occurs in that data block can be calculated, and the entropy of the data block is then calculated from these probabilities according to formula (3):
$$H_k = -\sum_{j=1}^{n} p_j \log_2 p_j \tag{3}$$
In formula (3), $H_k$ is the entropy of each data block, j denotes the j-th query record in the data block, n denotes the number of query records in the data block, and $p_j$ is the probability that the query type of the j-th query record occurs in the data block.
When the preset query attribute is the query source IP, the elements of the query source IP are the IP addresses corresponding to the query records. Because each query record in a data block can come from only one IP address, the probability that the IP address of each query record occurs in the data block can be calculated, and the entropy of the data block is then calculated from these probabilities.
It should be noted that two or more preset query attributes may be used at the same time. For example, when the preset query attributes are the query type and the query source IP, the entropy of each data block can be calculated separately for each of the two attributes. The two entropy values are then combined as a weighted sum, and the result is used as the final entropy of the data block.
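A weighted sum of per-attribute entropies might be sketched as follows. The attribute keys, the weights, and the entropy values are made-up illustrations; the text does not specify how the weights are chosen.

```python
from typing import Dict


def combined_entropy(entropies: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of the entropy values calculated for each preset query attribute."""
    if entropies.keys() != weights.keys():
        raise ValueError("each attribute needs exactly one weight")
    return sum(weights[name] * value for name, value in entropies.items())


# Hypothetical per-attribute entropies of one data block and equal weights.
per_attribute = {"query_type": 1.5, "source_ip": 3.0}
equal_weights = {"query_type": 0.5, "source_ip": 0.5}
print(combined_entropy(per_attribute, equal_weights))  # prints: 2.25
```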
Step 103: Determine whether a preset number of the obtained entropy values exceed a preset threshold; if so, determine that a DNS anomaly has occurred.
For example, if the preset number is set to 5, a DNS anomaly is determined to have occurred when 5 of the entropy values obtained in step 102 exceed the preset threshold. The preset number may also be set to another value, such as 1 or 2. The preset number affects the precision of the detection result: the larger the preset number, the higher the detection accuracy, but the higher the miss rate as well; the smaller the preset number, the lower the detection accuracy, and the lower the miss rate as well. The preset number should be chosen according to actual network conditions and experience.
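The decision in step 103 can be sketched as a simple count. This sketch assumes that any entropy values above the threshold are counted, whether or not they belong to consecutive data blocks, which the text does not specify; the function name, threshold, and sample values are hypothetical.

```python
from typing import Sequence


def dns_anomaly_detected(entropies: Sequence[float], threshold: float, preset_count: int) -> bool:
    """True when at least `preset_count` entropy values exceed `threshold`."""
    exceeding = sum(1 for value in entropies if value > threshold)
    return exceeding >= preset_count


# Hypothetical entropy values of eight consecutive data blocks.
observed = [1.1, 1.2, 1.0, 2.6, 2.8, 2.7, 2.9, 3.1]
print(dns_anomaly_detected(observed, threshold=2.5, preset_count=5))  # prints: True
print(dns_anomaly_detected(observed, threshold=2.5, preset_count=6))  # prints: False
```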
In this embodiment, the DNS query data may be historical DNS query data or real-time DNS query data. If the DNS query data is historical, the method provided by this embodiment can be used to analyze DNS usage, and the analysis results can be used for DNS optimization. This embodiment is more often applied to real-time detection, in which the DNS query data is real-time DNS query data, so that DNS anomalies are discovered in time and heavy losses are avoided.
To better demonstrate the effect of the present invention, the large-scale network outage that occurred on the Chinese Internet on May 19, 2009 is taken as an example. The outage was caused by an attack on the DNS system. The analysis is based on the query records collected between 9:00 and 24:00 on May 19, 2009 on an authoritative DNS server of a top-level node of China (CN). The query records of that period are divided into a plurality of data blocks of size 10000, that is, each data block contains 10000 query records; the entropy of each data block is calculated, and the resulting entropy values are plotted as an entropy curve. Fig. 3 is the entropy curve obtained with a data block size of 10000, and Fig. 4 is the DNS query rate curve, where the query rate is the number of queries per minute. As can be seen from Fig. 3, the entropy curve fluctuates sharply at about 16:00, that is, multiple entropy values exceed the preset threshold, which shows that a large amount of abnormal DNS traffic had begun to enter the network at that time, i.e., a DNS anomaly had occurred. In the query rate curve shown in Fig. 4, by contrast, the query traffic becomes clearly abnormal only at about 18:30, by which time the large-scale outage had already begun. The prior-art detection scheme based on query traffic therefore clearly lags and has a high miss rate, whereas the method for detecting DNS anomalies provided by the present invention detects DNS anomalies early and in time and thus serves as an early warning.
The present invention divides the DNS query data stream into a plurality of data blocks, calculates the entropy of the data blocks according to a preset query attribute to obtain a corresponding plurality of entropy values, and determines that a DNS anomaly has occurred when a preset number of these entropy values exceed a preset threshold. Entropy is a measure of the randomness of the distribution of a query attribute of the DNS query data. When a DNS anomaly occurs, for example when the DNS is attacked, the distribution of the query attributes changes, and the entropy changes with it, so the anomaly can be learned from the change in entropy. With the prior-art traffic-based detection method, when the anomaly is not yet pronounced, the DNS query traffic does not change noticeably, and the anomaly goes undetected. Only when the anomaly has become severe, for example when a large-scale network breakdown leaves a large number of users unable to use the network, can the traffic-based method detect the abnormal DNS traffic and hence the DNS anomaly, so it has an obvious lag. The present invention, by contrast, can detect a DNS anomaly before severe conditions such as a large-scale network failure occur. It thus provides early warning of DNS anomalies, allows users to prepare before the anomaly becomes severe, avoids the losses that a severe DNS anomaly would cause to users, reduces the miss rate, and improves user experience. Furthermore, because the DNS is an extremely complex system, the prior art, which determines whether a DNS anomaly has occurred from changes in query attribute values, does not take the complex state changes inside the DNS system into account, so its detection accuracy is low. In a further embodiment of the present invention, when the data blocks obtained by the division overlap, the resulting entropy values also reflect the changes in the internal state of the DNS system, which greatly improves detection accuracy.
Embodiment 2
Fig. 5 is a schematic diagram of an embodiment of the device for detecting DNS anomalies according to the present invention. As shown in Fig. 5, the device includes a division module 201, a calculation module 202, and a judgment module 203.
The division module 201 is configured to divide the DNS query data stream into a plurality of data blocks.
Specifically, the division module 201 is configured to divide the DNS query data stream into a plurality of data blocks by a specified time and/or by a specified query volume.
The calculation module 202 is configured to calculate, according to a preset query attribute, the entropy of the plurality of data blocks obtained by the division module 201, to obtain a corresponding plurality of entropy values.
The calculation module 202 includes a first calculation unit and a second calculation unit.
The first calculation unit is configured to calculate the probability that each element of the preset query attribute occurs in each data block.
The second calculation unit is configured to calculate, from the probabilities obtained by the first calculation unit, the entropy of the plurality of data blocks obtained by the division module 201, to obtain a corresponding plurality of entropy values.
When the data blocks obtained by the division module 201 overlap, the calculation module 202 includes:
a third calculation unit, configured to calculate the entropy $H_1$ of the previous data block adjacent to the current data block; and
a fourth calculation unit, configured to calculate the entropy $H_2$ of the current data block according to the entropy $H_1$, calculated by the third calculation unit, of the previous data block adjacent to the current data block.
The third calculation unit includes:
a first calculation subunit, configured to calculate, when the previous data block adjacent to the current data block is the first data block obtained by the division, the probability that each element of the preset query attribute occurs in the first data block; and
a second calculation subunit, configured to calculate the entropy $H_1$ of the first data block from the probability that each element of the preset query attribute occurs in the first data block.
The judgment module 203 is configured to determine whether a preset number of the entropy values obtained by the calculation module 202 exceed a preset threshold and, if so, to output information indicating that a DNS anomaly has occurred.
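The three modules and the two calculation units of the non-overlapping case can be mirrored in software roughly as follows. This is an illustrative sketch only: the class names, the dictionary record layout, division by a fixed record count, and the sample threshold are all assumptions, not the device's actual implementation.

```python
import math
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

Record = Dict[str, str]  # hypothetical record layout, e.g. {"qtype": "A"}


@dataclass
class DivisionModule:  # corresponds to module 201
    block_size: int

    def divide(self, records: Sequence[Record]) -> List[List[Record]]:
        """Divide the stream by a specified query volume."""
        return [list(records[i:i + self.block_size])
                for i in range(0, len(records), self.block_size)]


@dataclass
class CalculationModule:  # corresponds to module 202
    attribute: Callable[[Record], str]

    def probabilities(self, block: Sequence[Record]) -> List[float]:
        """First calculation unit: probability of each attribute element in the block."""
        counts = Counter(self.attribute(record) for record in block)
        return [count / len(block) for count in counts.values()]

    def entropies(self, blocks: Sequence[Sequence[Record]]) -> List[float]:
        """Second calculation unit: one entropy value per data block."""
        return [-sum(p * math.log2(p) for p in self.probabilities(block))
                for block in blocks]


@dataclass
class JudgmentModule:  # corresponds to module 203
    threshold: float
    preset_count: int

    def judge(self, entropies: Sequence[float]) -> bool:
        """True when a preset number of entropy values exceed the threshold."""
        return sum(value > self.threshold for value in entropies) >= self.preset_count


records = [{"qtype": t} for t in ["A", "A", "A", "A", "A", "MX", "PTR", "NS"]]
blocks = DivisionModule(block_size=4).divide(records)
entropies = CalculationModule(attribute=lambda r: r["qtype"]).entropies(blocks)
alarm = JudgmentModule(threshold=1.0, preset_count=1).judge(entropies)
print(entropies[1], alarm)  # prints: 2.0 True
```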
It should be noted that the first embodiment of the device for detecting DNS anomalies substantially corresponds to the first embodiment of the method, so for related details, refer to the description of the first method embodiment.
The present invention divides the DNS query data stream into a plurality of data blocks, calculates the entropy of the data blocks according to a preset query attribute to obtain a corresponding plurality of entropy values, and determines that a DNS anomaly has occurred when a preset number of these entropy values exceed a preset threshold. Entropy is a measure of the randomness of the distribution of a query attribute of the DNS query data. When a DNS anomaly occurs, for example when the DNS is attacked, the distribution of the query attributes changes, and the entropy changes with it, so the anomaly can be learned from the change in entropy. With the prior-art traffic-based detection method, when the anomaly is not yet pronounced, the DNS query traffic does not change noticeably, and the anomaly goes undetected. Only when the anomaly has become severe, for example when a large-scale network breakdown leaves a large number of users unable to use the network, can the traffic-based method detect the abnormal DNS traffic and hence the DNS anomaly, so it has an obvious lag. The present invention, by contrast, can detect a DNS anomaly before severe conditions such as a large-scale network failure occur. It thus provides early warning of DNS anomalies, allows users to prepare before the anomaly becomes severe, avoids the losses that a severe DNS anomaly would cause to users, reduces the miss rate, and improves user experience. Furthermore, because the DNS is an extremely complex system, the prior art, which determines whether a DNS anomaly has occurred from changes in query attribute values, does not take the complex state changes inside the DNS system into account, so its detection accuracy is low. In a further embodiment of the present invention, when the data blocks obtained by the division overlap, the resulting entropy values also reflect the changes in the internal state of the DNS system, which greatly improves detection accuracy.
A person of ordinary skill in the art will understand that all or some of the steps of the foregoing method embodiments may be implemented by hardware instructed by a program. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the foregoing method embodiments. The storage medium includes any medium that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disc.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced with equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.