Summary of the Invention
The embodiments of the present application provide an abnormal access detection method and apparatus for detecting URLs that are subject to abnormal access.
An embodiment of the present application provides an abnormal access detection method, including:
determining a bypass rate of a first uniform resource locator (URL) according to the out-degree and the in-degree of the first URL; where the out-degree of the first URL is the number of accesses from the first URL to a downstream URL of the first URL, the in-degree of the first URL is the number of accesses from an upstream URL of the first URL to the first URL, and the bypass rate of the first URL reflects the extent to which the downstream URL of the first URL is accessed directly without passing through the first URL;
if the bypass rate of the first URL is greater than a set bypass rate threshold, judging whether the downstream URL of the first URL is a URL accessed by normal polling; if the downstream URL is a URL accessed by normal polling, determining that the downstream URL has no abnormal access; and if the downstream URL is not a URL accessed by normal polling, determining that the downstream URL has abnormal access.
Optionally, judging whether the downstream URL of the first URL is a URL accessed by normal polling includes:
determining an average access time interval corresponding to the downstream URL according to the access time intervals respectively corresponding to multiple Internet Protocol (IP) addresses that access the downstream URL;
if the average access time interval is less than a set duration, determining that the downstream URL is not a URL accessed by normal polling;
if the average access time interval is greater than or equal to the set duration, determining the standard deviation of the access time intervals respectively corresponding to the multiple IP addresses; if the standard deviation is greater than a set standard deviation threshold, determining that the downstream URL is not a URL accessed by normal polling; and if the standard deviation is less than or equal to the set standard deviation threshold, determining that the downstream URL is a URL accessed by normal polling.
Optionally, the access time interval corresponding to each of the multiple IP addresses is determined according to the following step:
for each of the multiple IP addresses, taking the mode of the multiple access time intervals corresponding to the IP address as the access time interval corresponding to the IP address.
Optionally, the multiple IP addresses that access the downstream URL are selected according to the following step:
from the recorded IP addresses that access the downstream URL, selecting the IP addresses whose access count is greater than a first threshold and less than a second threshold.
Optionally, the bypass rate μ of the first URL is determined according to the following formula:
μ = (λ1 - λ2) / λ2
where λ1 is the out-degree of the first URL and λ2 is the in-degree of the first URL.
An embodiment of the present application provides an abnormal access detection apparatus, including:
a bypass rate determining module, configured to determine a bypass rate of a first uniform resource locator (URL) according to the out-degree and the in-degree of the first URL; where the out-degree of the first URL is the number of accesses from the first URL to a downstream URL of the first URL, the in-degree of the first URL is the number of accesses from an upstream URL of the first URL to the first URL, and the bypass rate of the first URL reflects the extent to which the downstream URL of the first URL is accessed directly without passing through the first URL;
an abnormal access determining module, configured to: if the bypass rate of the first URL is greater than a set bypass rate threshold, judge whether the downstream URL of the first URL is a URL accessed by normal polling; if the downstream URL is a URL accessed by normal polling, determine that the downstream URL has no abnormal access; and if the downstream URL is not a URL accessed by normal polling, determine that the downstream URL has abnormal access.
The embodiments of the present application determine the bypass rate of the first URL according to the out-degree and the in-degree of the first URL; if the bypass rate of the first URL is greater than the set bypass rate threshold, judge whether the downstream URL of the first URL is a URL accessed by normal polling; if the downstream URL is a URL accessed by normal polling, determine that the downstream URL has no abnormal access; and if the downstream URL is not a URL accessed by normal polling, determine that the downstream URL has abnormal access. In the embodiments of the present application, if the bypass rate of a URL is greater than the set bypass rate threshold and the downstream URL of that URL is not a URL accessed by normal polling, the downstream URL has abnormal access. The present application can therefore effectively detect a URL that a malicious user, bypassing the normal business logic, accesses over a long period from multiple IP addresses, as well as a URL that a malicious user, bypassing the normal business logic, accesses rapidly hundreds of times before switching to another IP address.
Detailed Description
Suppose a service link includes three URLs, A, B and C, which are called in the order A → B → C. Because an access to the service link may fail, or the user may quit before completing it, the call counts generally satisfy: calls to A ≥ calls to B ≥ calls to C. If instead the call counts satisfy calls to B ≥ calls to A ≥ calls to C, then B, which is being accessed in large volume, has probably been reached by bypassing the normal business logic, whose only entry is A. On this basis, for a service link, the embodiments of the present application first use the bypass rate of a URL to judge whether the downstream URL of that URL may have abnormal access: if the bypass rate of the URL is greater than the set bypass rate threshold, the downstream URL of the URL may have abnormal access, and it is then further judged whether the downstream URL is a URL accessed by normal polling; if it is not, the downstream URL has abnormal access. The present application can effectively detect a URL that a malicious user, bypassing the normal business logic, accesses over a long period from multiple IP addresses, or accesses rapidly hundreds of times before switching to another IP address.
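The call-count observation above can be illustrated with a minimal Python sketch. The function name `find_suspicious_links`, the sample counts, and the use of a strict "greater than" comparison are assumptions made for illustration only; the embodiments themselves rely on the bypass rate described below rather than on this raw comparison.

```python
def find_suspicious_links(chain, call_counts):
    """Return (upstream, downstream) pairs where the downstream URL is
    called more often than its upstream URL (hypothetical helper)."""
    suspicious = []
    for upstream, downstream in zip(chain, chain[1:]):
        if call_counts[downstream] > call_counts[upstream]:
            suspicious.append((upstream, downstream))
    return suspicious


chain = ["A", "B", "C"]                          # call order A -> B -> C
normal = {"A": 1000, "B": 800, "C": 500}         # counts shrink along the link
skewed = {"A": 1000, "B": 50000, "C": 500}       # B is called far more than A

normal_result = find_suspicious_links(chain, normal)
skewed_result = find_suspicious_links(chain, skewed)
```

With the sample counts, the normal link yields no suspicious pair, while the skewed link flags the pair (A, B), which is the situation in which B may have been reached without going through A.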
The embodiments of the present application are described in further detail below with reference to the accompanying drawings.
As shown in Fig. 1, a flowchart of the abnormal access detection method provided by an embodiment of the present application includes the following steps:
S101: Determine the bypass rate of the first URL according to the out-degree and the in-degree of the first URL.
In a specific implementation, for a URL in a service link, the number of accesses from the URL to its downstream URL is defined as the out-degree of the URL, and the number of accesses from the upstream URL of the URL to the URL is defined as the in-degree of the URL; the bypass rate of the URL reflects the extent to which the downstream URL of the URL is accessed directly without passing through the URL.
As shown in Fig. 2, in a service link, a URL that has only a downstream URL and no upstream URL is a starting URL (such as URL1 in Fig. 2); a URL that has both an upstream URL and a downstream URL is an intermediate URL (such as URL2 in Fig. 2); and a URL that has only an upstream URL and no downstream URL is a leaf URL (such as URL3 in Fig. 2). The out-degree of a URL's upstream URL is the in-degree of that URL, and the out-degree of a URL is the in-degree of its downstream URL. The out-degree of a starting URL is greater than 0 and its in-degree is 0; for example, URL1 in Fig. 2 has an out-degree of 1000 and an in-degree of 0. The in-degree of a leaf URL is greater than 0 and its out-degree is 0; for example, URL3 in Fig. 2 has an in-degree of 100000. The in-degree and the out-degree of an intermediate URL are both greater than 0; for example, URL2 in Fig. 2 has an in-degree of 1000 and an out-degree of 100000.
In a specific implementation, the bypass rate of a URL is determined from the out-degree and the in-degree of the URL. Because the bypass rate reflects the extent to which the downstream URL of the URL is accessed directly without passing through the URL, it can be determined from the difference between the out-degree and the in-degree of the URL. For example, the bypass rate may be expressed as the ratio of the out-degree to the in-degree; in this case, the bypass rate of a starting URL is infinite and the bypass rate of a leaf URL is 0. As another example, the bypass rate may be expressed as the ratio of the difference between the out-degree and the in-degree to the in-degree, that is, the bypass rate μ of the URL satisfies μ = (λ1 - λ2) / λ2, where λ1 is the out-degree of the URL and λ2 is the in-degree of the URL; in this case, the bypass rate of a starting URL is infinite and the bypass rate of a leaf URL is -1.
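As a minimal sketch of the second formula, the following Python function computes μ = (λ1 - λ2) / λ2 for the three URLs of Fig. 2. The function name `bypass_rate` and the choice to return `math.inf` when the in-degree is 0 are illustrative assumptions; the text only states that the bypass rate of a starting URL is infinite.

```python
import math


def bypass_rate(out_degree, in_degree):
    """Bypass rate mu = (out_degree - in_degree) / in_degree.

    A starting URL has in-degree 0; its rate is treated as infinite
    (assumption: represented here as math.inf).
    """
    if in_degree == 0:
        return math.inf
    return (out_degree - in_degree) / in_degree


# Out-degree and in-degree values taken from the Fig. 2 example.
rate_url1 = bypass_rate(1000, 0)         # starting URL
rate_url2 = bypass_rate(100000, 1000)    # intermediate URL
rate_url3 = bypass_rate(0, 100000)       # leaf URL
```

For the Fig. 2 values, URL2 has a bypass rate of 99, meaning that its downstream URL3 receives 99 times more accesses than could have arrived through URL2.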
In the embodiments of the present application, the first URL is an intermediate URL, and the downstream URL of the first URL may be either an intermediate URL or a leaf URL.
S102: Compare the bypass rate of the first URL with the set bypass rate threshold; if the bypass rate of the first URL is less than or equal to the set bypass rate threshold, proceed to S104, that is, determine that the downstream URL of the first URL has no abnormal access.
In a specific implementation, for a URL with a low bypass rate, the downstream URL of the URL can be considered to have no abnormal access.
S103: If the bypass rate of the first URL is greater than the set bypass rate threshold, judge whether the downstream URL of the first URL is a URL accessed by normal polling; if the downstream URL is a URL accessed by normal polling, proceed to S104, that is, determine that the downstream URL has no abnormal access; if the downstream URL is not a URL accessed by normal polling, proceed to S105, that is, determine that the downstream URL has abnormal access.
In a specific implementation, if the bypass rate of the first URL is greater than the set bypass rate threshold, there are two possible cases:
1. The downstream URL of the first URL is accessed in large volume by bypassing the normal business logic, that is, abnormal access exists.
2. The downstream URL of the first URL is a URL accessed by normal polling.
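The branching of steps S102 to S105 can be sketched in Python as follows. The function name `has_abnormal_access`, the threshold value 10, and the callable `downstream_is_polled` (which stands in for the polling judgment of S103) are hypothetical and not taken from the source.

```python
import math


def has_abnormal_access(out_degree, in_degree, rate_threshold, downstream_is_polled):
    """Return True if the downstream URL is judged to have abnormal access.

    downstream_is_polled is a zero-argument callable (assumption) so that the
    polling judgment is evaluated only when the bypass rate is high.
    """
    if in_degree == 0:
        rate = math.inf
    else:
        rate = (out_degree - in_degree) / in_degree
    if rate <= rate_threshold:           # S102 -> S104: no abnormal access
        return False
    return not downstream_is_polled()    # S103 -> S104 or S105


# Intermediate URL from Fig. 2 (bypass rate 99) with a hypothetical threshold of 10.
case_bypassed = has_abnormal_access(100000, 1000, 10, lambda: False)
case_polling = has_abnormal_access(100000, 1000, 10, lambda: True)
case_low_rate = has_abnormal_access(900, 1000, 10, lambda: False)
```

Passing the polling judgment as a callable reflects the order of the flowchart: the more expensive per-IP analysis is only needed for URLs whose bypass rate exceeds the threshold.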
Here, for a first URL with a high bypass rate, if the downstream URL of the first URL is a URL accessed by normal polling, the direct accesses to the downstream URL that bypass the first URL are normal accesses. For example, if a user stays on a page that displays unread messages (such as the inbox page of a mailbox), the system may refresh the unread messages every 5 s; automatic polling access of this kind is clearly normal access.
Because the access time interval of a URL accessed by normal polling is generally set by the system, such a URL usually has an obvious characteristic: the access time intervals of different IP addresses are relatively fixed. For example, a system generally reads unread messages at a fixed time interval. Therefore, whether a URL is accessed by normal polling can be judged from the time intervals at which different IP addresses access the URL.
In one implementation, the specific steps for judging whether the downstream URL of the first URL is a URL accessed by normal polling are as follows:
S103a: Determine the average access time interval corresponding to the downstream URL according to the access time intervals respectively corresponding to the multiple IP addresses that access the downstream URL of the first URL.
In practice, multiple users within an enterprise may share the same IP address; in such cases, the number of times a URL is accessed from that IP address is usually very large (typically more than 1000). In addition, IP addresses with few accesses contribute little to identifying polling access. For this reason, to further improve the recognition of normal polling access, the embodiments of the present application screen the IP addresses that access the downstream URL. Specifically, from the recorded IP addresses that access the downstream URL, multiple IP addresses whose access count is greater than a first threshold (for example, 10) and less than a second threshold (for example, 1000) are selected, and the users of these IP addresses are provisionally treated as normal individual users; whether the downstream URL is a URL accessed by normal polling is then judged on the basis of the selected IP addresses.
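A minimal Python sketch of this screening step is shown below. The log format (a list of `(ip, timestamp)` pairs), the function name `select_ip_addresses`, and the sample addresses are assumptions for illustration; the default thresholds 10 and 1000 are the example values given above.

```python
from collections import Counter


def select_ip_addresses(access_log, first_threshold=10, second_threshold=1000):
    """Keep IP addresses whose access count to the downstream URL lies
    strictly between the two thresholds.

    access_log: iterable of (ip, timestamp) pairs (assumed record format).
    """
    counts = Counter(ip for ip, _ in access_log)
    return sorted(
        ip for ip, n in counts.items()
        if first_threshold < n < second_threshold
    )


# Hypothetical log: one rarely seen address, one typical user, one shared address.
log = (
    [("192.0.2.1", t) for t in range(5)]
    + [("192.0.2.2", t) for t in range(50)]
    + [("198.51.100.7", t) for t in range(2000)]
)
selected = select_ip_addresses(log)
```

Only the address with 50 accesses is kept: the address with 5 accesses contributes too little, and the address with 2000 accesses is likely shared by many users.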
In a specific implementation, for each selected IP address, the mode of the multiple access time intervals corresponding to the IP address may be taken as the access time interval corresponding to the IP address. The mode here is the access time interval that occurs most often among the recorded access time intervals of the IP address; if several access time intervals tie for the most occurrences, any one of them may be selected as the access time interval corresponding to the IP address.
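The per-IP interval can be sketched in Python with the standard library's `statistics.mode`. The function name `representative_interval` and the assumption that access times are available as numeric timestamps in seconds are illustrative and not part of the source.

```python
from statistics import mode


def representative_interval(timestamps):
    """Return the most frequent gap between consecutive accesses of one IP.

    timestamps: access times of a single IP address, in seconds (assumed).
    At least two timestamps are required; otherwise statistics.mode raises
    StatisticsError because there are no intervals.
    """
    ordered = sorted(timestamps)
    intervals = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return mode(intervals)


# Gaps are 5, 5, 5, 7, 5 seconds, so the mode is 5.
interval = representative_interval([0, 5, 10, 15, 22, 27])
```

In Python 3.8 and later, `statistics.mode` returns the first mode encountered when several values tie, which matches the text's allowance that any one of the tied intervals may be chosen.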
S103b: Compare the average access time interval with the set duration; if the average access time interval is less than the set duration, proceed to S103e, that is, determine that the downstream URL is not a URL accessed by normal polling.
In a specific implementation, a malicious user may use the same or different IP addresses to access the downstream URL hundreds of times within a very short time (for example, 1 s); that is, the malicious user accesses the URL intensively hundreds of times and is then blocked, or switches to another IP address to continue the access. If access time intervals are recorded only to a precision of 1 s, the recorded access time interval of such an IP address is 0. In this case, although the recorded access time intervals of the IP addresses that access the downstream URL are relatively fixed (all 0), the access clearly is not polling access. Therefore, the embodiments of the present application directly classify a URL whose average access time interval over the corresponding multiple IP addresses is less than the set duration (for example, 1 s) as a URL with abnormal access.
S103c: If the average access time interval is greater than or equal to the set duration, determine the standard deviation of the access time intervals respectively corresponding to the multiple IP addresses.
Here, the standard deviation σ of the access time intervals respectively corresponding to the selected IP addresses satisfies the following formula:
$$\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(x_i - \mu\right)^2}$$
where x_i is the access time interval of the i-th IP address, μ is the average access time interval corresponding to the selected IP addresses, and N is the number of selected IP addresses.
S103d: Compare the standard deviation with the set standard deviation threshold; if the standard deviation is greater than the set standard deviation threshold (for example, 1000), proceed to S103e, that is, determine that the downstream URL is not a URL accessed by normal polling. If the standard deviation is less than or equal to the set standard deviation threshold, proceed to S103f, that is, determine that the downstream URL is a URL accessed by normal polling.
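Steps S103a to S103f can be put together in a short Python sketch. The function name `is_normally_polled`, the dictionary input, and the choice of milliseconds as the unit are assumptions; the source gives 1 s as an example set duration and 1000 as an example standard deviation threshold without stating a unit for the latter. The population standard deviation (`statistics.pstdev`, which divides by N) is used here.

```python
from statistics import fmean, pstdev


def is_normally_polled(ip_intervals, min_mean_interval=1000.0, std_threshold=1000.0):
    """Judge whether a downstream URL is accessed by normal polling.

    ip_intervals: maps each selected IP address to its representative access
    time interval, in milliseconds (assumed unit; must not be empty).
    """
    intervals = list(ip_intervals.values())
    mean_interval = fmean(intervals)               # S103a
    if mean_interval < min_mean_interval:          # S103b -> S103e
        return False
    return pstdev(intervals) <= std_threshold      # S103c, S103d -> S103e or S103f


# Hypothetical inputs: steady 5 s polling, burst access, and irregular access.
steady = is_normally_polled({"ip1": 5000, "ip2": 5000, "ip3": 5010})
burst = is_normally_polled({"ip1": 0, "ip2": 0})
irregular = is_normally_polled({"ip1": 2000, "ip2": 60000, "ip3": 600000})
```

The burst case fails the mean-interval check even though its intervals are perfectly uniform, which is the situation the set duration is meant to catch; the irregular case passes the mean check but fails on the spread of its intervals.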
Based on the same inventive concept, an embodiment of the present application further provides an abnormal access detection apparatus corresponding to the abnormal access detection method. Because the principle by which the apparatus solves the problem is similar to that of the abnormal access detection method of the embodiments of the present application, the implementation of the apparatus may refer to the implementation of the method, and repeated details are omitted.
As shown in Fig. 3, a schematic structural diagram of the abnormal access detection apparatus provided by an embodiment of the present application includes:
a bypass rate determining module 31, configured to determine a bypass rate of a first uniform resource locator (URL) according to the out-degree and the in-degree of the first URL; where the out-degree of the first URL is the number of accesses from the first URL to a downstream URL of the first URL, the in-degree of the first URL is the number of accesses from an upstream URL of the first URL to the first URL, and the bypass rate of the first URL reflects the extent to which the downstream URL of the first URL is accessed directly without passing through the first URL; and
an abnormal access determining module 32, configured to: if the bypass rate of the first URL is greater than a set bypass rate threshold, judge whether the downstream URL of the first URL is a URL accessed by normal polling; if the downstream URL is a URL accessed by normal polling, determine that the downstream URL has no abnormal access; and if the downstream URL is not a URL accessed by normal polling, determine that the downstream URL has abnormal access.
Optionally, the abnormal access determining module 32 is specifically configured to:
determine the average access time interval corresponding to the downstream URL according to the access time intervals respectively corresponding to multiple Internet Protocol (IP) addresses that access the downstream URL; if the average access time interval is less than a set duration, determine that the downstream URL is not a URL accessed by normal polling; if the average access time interval is greater than or equal to the set duration, determine the standard deviation of the access time intervals respectively corresponding to the multiple IP addresses; if the standard deviation is greater than a set standard deviation threshold, determine that the downstream URL is not a URL accessed by normal polling; and if the standard deviation is less than or equal to the set standard deviation threshold, determine that the downstream URL is a URL accessed by normal polling.
Optionally, the abnormal access determining module 32 is specifically configured to determine the access time interval corresponding to each of the multiple IP addresses according to the following step:
for each of the multiple IP addresses, taking the mode of the multiple access time intervals corresponding to the IP address as the access time interval corresponding to the IP address.
Optionally, the abnormal access determining module 32 is specifically configured to select the multiple IP addresses that access the downstream URL according to the following step:
from the recorded IP addresses that access the downstream URL, selecting the IP addresses whose access count is greater than a first threshold and less than a second threshold.
Optionally, the bypass rate determining module 31 is specifically configured to determine the bypass rate μ of the first URL according to the following formula:
μ = (λ1 - λ2) / λ2
where λ1 is the out-degree of the first URL and λ2 is the in-degree of the first URL.
Those skilled in the art should understand that the embodiments of the present application may be provided as a method, a system, or a computer program product. Therefore, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, and optical storage) that contain computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of the method, the apparatus (system), and the computer program product according to the embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or another programmable data processing device to work in a specific manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or another programmable data processing device, such that a series of operation steps are performed on the computer or other programmable device to produce computer-implemented processing, so that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although preferred embodiments of the present application have been described, those skilled in the art can make additional changes and modifications to these embodiments once they learn of the basic inventive concept. Therefore, the appended claims are intended to be construed as including the preferred embodiments and all changes and modifications that fall within the scope of the present application.
Obviously, those skilled in the art can make various changes and variations to the present application without departing from its spirit and scope. Thus, if these modifications and variations fall within the scope of the claims of the present application and their equivalent technologies, the present application is also intended to include them.