CN107426136B - Network attack identification method and device - Google Patents

Info

Publication number
CN107426136B
CN107426136B CN201610345863.1A CN201610345863A CN107426136B CN 107426136 B CN107426136 B CN 107426136B CN 201610345863 A CN201610345863 A CN 201610345863A CN 107426136 B CN107426136 B CN 107426136B
Authority
CN
China
Prior art keywords
page
sub
access
current
probability
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610345863.1A
Other languages
Chinese (zh)
Other versions
CN107426136A (en)
Inventor
任杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Tencent Cloud Computing Beijing Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201610345863.1A
Publication of CN107426136A
Application granted
Publication of CN107426136B
Legal status: Active
Anticipated expiration

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00: Network architectures or network communication protocols for network security
    • H04L63/14: Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408: Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Computer And Data Communications (AREA)

Abstract

Embodiments of the invention disclose a network attack identification method and device. The method comprises: when the access traffic state is abnormal, acquiring a current access data set containing current access data associated with a target page, and calculating, according to a Bayesian formula and the current access data set, the current associated access probability between the target page and each of a plurality of sub-pages; comparing the current associated access probability between the target page and each sub-page with the corresponding historical associated access probability; and, when the difference between the current associated access probability and the historical associated access probability for a target sub-page among the plurality of sub-pages exceeds a preset numerical threshold, determining that the user IP address in the current access data set is an illegal IP address. The invention can ensure the accuracy of network attack identification and can reduce labor cost.

Description

Network attack identification method and device
Technical Field
The invention relates to the technical field of internet, in particular to a network attack identification method and device.
Background
With the popularization of the internet, more and more network attacks have appeared, such as CC (Challenge Collapsar) attacks, which mainly target web pages. In a CC attack, an attacker controls a number of hosts to continuously send a large number of data packets to the target server, exhausting server resources until the server crashes. To better defend against CC attacks, which are to some extent concealed, they must be identified accurately.
Currently, a CC attack may be identified as follows: an administrator selects the corresponding Web log according to the log time attribute and opens the selected Web log for analysis, so as to determine whether the Web site is under a CC attack. The existing identification method therefore depends on manual analysis; when there are a large number of Web logs, labor cost increases greatly, and the accuracy of manual analysis cannot always be guaranteed.
Disclosure of Invention
The embodiment of the invention provides a network attack identification method and device, which can ensure the identification accuracy of network attacks and can reduce the labor cost.
The embodiment of the invention provides a network attack identification method, which comprises the following steps:
when the access flow state is an abnormal state, acquiring a current access data set containing current access data associated with a target page, and calculating current associated access probabilities respectively corresponding to the target page and each sub-page in a plurality of sub-pages according to a Bayesian formula and the current access data set;
comparing the current associated access probability respectively corresponding to the target page and each sub-page with the historical associated access probability respectively corresponding to the target page and each sub-page; the historical associated access probability is calculated according to the Bayesian formula and a historical access data set acquired when the access flow state is a normal state;
and when the difference value between the current associated access probability corresponding to the target sub-page and the historical associated access probability in the plurality of sub-pages exceeds a preset numerical threshold, determining that the user IP address in the current access data set is an illegal IP address.
Correspondingly, the embodiment of the invention also provides a device for identifying network attacks, which comprises:
the acquisition and calculation module is used for acquiring a current access data set containing current access data associated with a target page when the access flow state is an abnormal state, and calculating current associated access probabilities respectively corresponding to the target page and each sub-page in the sub-pages according to a Bayesian formula and the current access data set;
a comparison module, configured to compare current associated access probabilities respectively corresponding to the target page and the sub-pages with historical associated access probabilities respectively corresponding to the target page and the sub-pages; the historical associated access probability is calculated according to the Bayesian formula and a historical access data set acquired when the access flow state is a normal state;
a determining module, configured to determine that a user IP address in the current access data set is an illegal IP address when a difference between the current associated access probability corresponding to a target sub-page and the historical associated access probability in the plurality of sub-pages exceeds a preset numerical threshold.
In the embodiments of the invention, when the access traffic state is abnormal, the current associated access probability between a target page and each of a plurality of sub-pages is calculated and compared with the corresponding historical associated access probability; when, for a target sub-page among the plurality of sub-pages, the difference between the current and historical associated access probabilities exceeds a preset numerical threshold, the user IP address in the current access data set is determined to be an illegal IP address. The invention therefore no longer relies on manual analysis, which greatly reduces labor cost. Moreover, because the current and historical associated access probabilities are calculated with the Bayesian formula, the page access habits of users can be analyzed accurately, and whether a network attack is currently under way can be determined from changes in those habits, which ensures the accuracy of network attack identification.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
Fig. 1 is a schematic flowchart of a network attack identification method according to an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of an apparatus for identifying a network attack according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of an acquisition computing module according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of another network attack recognition apparatus according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 1 is a schematic flow chart of a network attack identification method according to an embodiment of the present invention; the method may include:
S101, when the access flow state is an abnormal state, acquiring a current access data set containing current access data associated with a target page, and calculating current associated access probabilities respectively corresponding to the target page and each sub-page in a plurality of sub-pages according to a Bayesian formula and the current access data set;
Specifically, the network attack identification device provided in this embodiment may periodically detect whether the access traffic per unit time in the current network exceeds a preset traffic threshold, that is, whether the page access frequency is excessive. If the access traffic per unit time exceeds the preset traffic threshold, the access traffic state is determined to be abnormal; otherwise it is determined to be normal. When the access traffic state is abnormal, the device may acquire a current access data set containing current access data associated with a target page. The current access data set may include current access data for each of a plurality of users, and the current access data of each user includes that user's IP (Internet Protocol) address and the user's access records for a plurality of pages. When the current access data of a user includes an access to the target page (that is, the user has accessed the target page), that user's current access data is associated with the target page, so the current access data set contains current access data associated with the target page. The current access data set may of course also contain current access data that is not associated with the target page (for example, when none of the pages a user accessed is the target page).
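The threshold check described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the names `TrafficState` and `classify_traffic_state`, the request-count input, and the choice of requests per second as the unit of traffic are all assumptions made for the example.

```python
from enum import Enum


class TrafficState(Enum):
    NORMAL = "normal"
    ABNORMAL = "abnormal"


def classify_traffic_state(
    requests_in_window: int,
    window_seconds: float,
    threshold_per_second: float,
) -> TrafficState:
    """Compare the access traffic per unit time with a preset traffic threshold."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    # Access traffic per unit time (hypothetical unit: requests per second).
    rate = requests_in_window / window_seconds
    # Strictly exceeding the threshold marks the state as abnormal.
    if rate > threshold_per_second:
        return TrafficState.ABNORMAL
    return TrafficState.NORMAL
```

A periodic job could call `classify_traffic_state` once per detection period and trigger the probability calculation only when the result is `TrafficState.ABNORMAL`.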
Further, the probability of accessing each sub-page may be calculated from the current access data set, giving the sub-page access probability for each sub-page. The sub-pages are screened out in advance; there may be several of them, and each sub-page has one sub-page access probability. For example, suppose sub-page A, sub-page B, and sub-page C are screened out in advance. Once the current access data set is obtained, the number of users who accessed sub-page A, the number who accessed sub-page B, and the number who accessed sub-page C are counted. If the current access data set covers 100 users, of whom 10 accessed sub-page A, 15 accessed sub-page B, and 50 accessed sub-page C, then the sub-page access probability is 10% for sub-page A, 15% for sub-page B, and 50% for sub-page C.
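The counting in this example can be sketched in Python. The data model (a dictionary from user IP address to the set of pages that user accessed), the function names, and the synthetic IP addresses are assumptions made for illustration; the source does not specify how access data is stored.

```python
from typing import Dict, Iterable, Set

# Hypothetical data model: user IP address -> set of pages that user accessed.
AccessDataSet = Dict[str, Set[str]]


def sub_page_access_probabilities(
    access_data: AccessDataSet, sub_pages: Iterable[str]
) -> Dict[str, float]:
    """Fraction of users in the data set who accessed each sub-page."""
    total_users = len(access_data)
    if total_users == 0:
        raise ValueError("access data set is empty")
    return {
        page: sum(1 for pages in access_data.values() if page in pages) / total_users
        for page in sub_pages
    }


def build_example_data_set() -> AccessDataSet:
    """100 synthetic users: 10 accessed A, 15 accessed B, 50 accessed C."""
    data: AccessDataSet = {}
    for i in range(100):
        pages: Set[str] = set()
        if i < 10:
            pages.add("A")
        elif i < 25:
            pages.add("B")
        elif i < 75:
            pages.add("C")
        data[f"10.0.0.{i}"] = pages
    return data


probabilities = sub_page_access_probabilities(build_example_data_set(), ["A", "B", "C"])
```

With the synthetic data set, `probabilities` holds 0.10, 0.15, and 0.50 for sub-pages A, B, and C, matching the percentages in the example.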
While calculating the sub-page access probabilities, the network attack identification device may also calculate, from the current access data set, the probability that the target page was accessed given that the corresponding sub-page was accessed, giving the conditional access probability for each sub-page. The target page may likewise be screened out in advance. For example, suppose a target page X, sub-page A, sub-page B, and sub-page C are screened out in advance, and the acquired current access data set covers 100 users, of whom 8 accessed both sub-page A and the target page X, 15 accessed both sub-page B and the target page X, and 40 accessed both sub-page C and the target page X. The conditional access probability is then calculated as 8% for sub-page A, 15% for sub-page B, and 40% for sub-page C.
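A hedged sketch of this step follows. One point of interpretation: the percentages in the example above (8%, 15%, 40%) equal the co-visit counts divided by all 100 users. The sketch instead divides by the number of users who accessed the sub-page, which is the usual definition of a conditional probability P(x | D_j); this is an assumption about what the formula expects, not something the source states. On the same counts, with 10, 15, and 50 users accessing sub-pages A, B, and C, it yields 80%, 100%, and 80%. The data model and all names are hypothetical.

```python
from typing import Dict, Iterable, Set

# Hypothetical data model: user IP address -> set of pages that user accessed.
AccessDataSet = Dict[str, Set[str]]


def conditional_access_probabilities(
    access_data: AccessDataSet, target_page: str, sub_pages: Iterable[str]
) -> Dict[str, float]:
    """P(target accessed | sub-page accessed), estimated from user counts."""
    result: Dict[str, float] = {}
    for page in sub_pages:
        visitors = [pages for pages in access_data.values() if page in pages]
        if not visitors:
            # Assumption: a sub-page nobody accessed contributes probability 0.
            result[page] = 0.0
            continue
        both = sum(1 for pages in visitors if target_page in pages)
        result[page] = both / len(visitors)
    return result


def build_example_data_set() -> AccessDataSet:
    """100 synthetic users reproducing the co-visit counts 8, 15, and 40."""
    data: AccessDataSet = {}
    for i in range(100):
        pages: Set[str] = set()
        if i < 10:            # 10 users accessed A, 8 of them also accessed X
            pages.add("A")
            if i < 8:
                pages.add("X")
        elif i < 25:          # 15 users accessed B, all of them also accessed X
            pages.update({"B", "X"})
        elif i < 75:          # 50 users accessed C, 40 of them also accessed X
            pages.add("C")
            if i < 65:
                pages.add("X")
        data[f"10.0.0.{i}"] = pages
    return data


conditionals = conditional_access_probabilities(
    build_example_data_set(), "X", ["A", "B", "C"]
)
```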
Further, the probability of having accessed the corresponding sub-page given that the target page was accessed is calculated from the Bayesian formula, each sub-page access probability, and each conditional access probability, giving the current associated access probability between the target page and each sub-page. The Bayesian formula is:
$$P(D_j \mid x) = \frac{P(x \mid D_j)\,P(D_j)}{\sum_{i} P(x \mid D_i)\,P(D_i)}$$
where j is an integer, $P(x \mid D_j)$ is the conditional access probability corresponding to the jth sub-page, $P(D_j)$ is the sub-page access probability corresponding to the jth sub-page, $P(x \mid D_i)$ is the conditional access probability corresponding to the ith sub-page, $P(D_i)$ is the sub-page access probability corresponding to the ith sub-page, and $P(D_j \mid x)$ is the current associated access probability corresponding to the jth sub-page.
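The Bayesian step can be sketched as a small function that takes the conditional access probabilities P(x | D_j) and the sub-page access probabilities P(D_j) and returns P(D_j | x) for every sub-page. The function name and dictionary-based interface are assumptions. The input values are illustrative: the sub-page access probabilities come from the example in the text (10%, 15%, 50%), while the conditional probabilities (80%, 100%, 80%) are assumed values.

```python
from typing import Dict


def associated_access_probabilities(
    conditional: Dict[str, float], sub_page_prob: Dict[str, float]
) -> Dict[str, float]:
    """Apply Bayes' formula: P(D_j | x) = P(x | D_j) P(D_j) / sum_i P(x | D_i) P(D_i)."""
    # Numerator terms P(x | D_j) * P(D_j), one per sub-page.
    weighted = {page: conditional[page] * sub_page_prob[page] for page in sub_page_prob}
    # Denominator: the sum of the numerator terms over all sub-pages.
    evidence = sum(weighted.values())
    if evidence == 0:
        raise ValueError("no sub-page has a non-zero probability of leading to the target")
    return {page: value / evidence for page, value in weighted.items()}


posterior = associated_access_probabilities(
    conditional={"A": 0.8, "B": 1.0, "C": 0.8},
    sub_page_prob={"A": 0.10, "B": 0.15, "C": 0.50},
)
```

For these inputs the numerators are 0.08, 0.15, and 0.40, so the associated access probabilities are 8/63, 15/63, and 40/63 (roughly 12.7%, 23.8%, and 63.5%), and they sum to 1 by construction.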
S102, comparing the current associated access probability respectively corresponding to the target page and each sub-page with the historical associated access probability respectively corresponding to the target page and each sub-page; the historical associated access probability is calculated according to the Bayesian formula and a historical access data set acquired when the access flow state is a normal state;
Specifically, after the current associated access probability between the target page and each of the plurality of sub-pages has been calculated, it may be compared with the corresponding historical associated access probability. For example, suppose there are sub-page A, sub-page B, and sub-page C, with current associated access probabilities a1 (target page and sub-page A), b1 (target page and sub-page B), and c1 (target page and sub-page C), and historical associated access probabilities a2 (target page and sub-page A), b2 (target page and sub-page B), and c2 (target page and sub-page C). Then a1 is compared with a2, b1 with b2, and c1 with c2.
The historical associated access probability is calculated according to the Bayesian formula and a historical access data set acquired when the access flow state is a normal state. The specific steps of generating the history association access probability may be: acquiring the historical access data set in the period that the access flow state is a normal state; the historical access data set contains historical access data associated with the target page; and according to the Bayesian formula and the historical access data set, respectively corresponding historical associated access probabilities between the target page and each sub-page are calculated. The step of generating the history associated access probability may be performed before S101 or when the access traffic state is an abnormal state. For example, when the access traffic state is an abnormal state, the network attack recognition device may find the historical access data set in a history record during the period when the access traffic state is a normal state, and calculate the historical associated access probability according to the bayesian formula and the historical access data set.
The historical access data set may include historical access data for each of a plurality of users, and the historical access data of each user includes that user's IP address and the user's access records for a plurality of pages. When the historical access data of a user includes an access to the target page (that is, the user has accessed the target page), that user's historical access data is associated with the target page, so the historical access data set contains historical access data associated with the target page. The historical access data set may of course also contain historical access data that is not associated with the target page (for example, when none of the pages a user accessed is the target page). The historical associated access probability between the target page and each sub-page may be calculated from the Bayesian formula and the historical access data set as follows: calculate, from the historical access data set, the probability of accessing each sub-page, giving the historical sub-page access probability for each sub-page; calculate, from the historical access data set, the probability that the target page was accessed given that the corresponding sub-page was accessed, giving the historical conditional access probability for each sub-page; and calculate, from the Bayesian formula, each historical sub-page access probability, and each historical conditional access probability, the probability of having accessed the corresponding sub-page given that the target page was accessed, giving the historical associated access probability between the target page and each sub-page. The sub-pages, the target page, and the Bayesian formula used in calculating the historical associated access probability are the same as those in S101 above.
S103, when the difference value between the current associated access probability and the historical associated access probability corresponding to a target sub-page in the sub-pages exceeds a preset numerical threshold, determining that the user IP address in the current access data set is an illegal IP address;
Specifically, when the difference between the current associated access probability and the historical associated access probability corresponding to a target sub-page among the plurality of sub-pages exceeds a preset numerical threshold, the user IP address in the current access data set may be determined to be an illegal IP address. For example, if the historical associated access probability for a sub-page is 35%, the current associated access probability for that sub-page is 85%, and the preset numerical threshold is 20%, then the difference (50%) exceeds the preset numerical threshold. The user IP address in the current access data set may therefore be determined to be an illegal IP address, and protective measures against CC attacks may be applied to the current access data set.
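The comparison in S102 and the decision in S103 can be sketched together. The function names and the dictionary data model are hypothetical. Using the absolute difference is an assumption, since the text only speaks of a "difference"; flagging every user IP address in the current access data set follows the wording of the text.

```python
from typing import Dict, List, Set


def deviating_sub_pages(
    current: Dict[str, float], historical: Dict[str, float], threshold: float
) -> List[str]:
    """Sub-pages whose current and historical probabilities differ by more than threshold."""
    return [
        page
        for page in current
        # Assumption: the difference is taken as an absolute value.
        if page in historical and abs(current[page] - historical[page]) > threshold
    ]


def illegal_ip_addresses(
    current_access_data: Dict[str, Set[str]],
    current: Dict[str, float],
    historical: Dict[str, float],
    threshold: float,
) -> Set[str]:
    """Flag the user IPs of the current data set if any sub-page deviates."""
    if deviating_sub_pages(current, historical, threshold):
        return set(current_access_data)
    return set()


# Values from the example in the text: historical 35%, current 85%, threshold 20%.
flagged = illegal_ip_addresses(
    current_access_data={"10.0.0.1": {"A", "X"}, "10.0.0.2": {"X"}},
    current={"A": 0.85},
    historical={"A": 0.35},
    threshold=0.20,
)
```

Here the difference of 50% exceeds the 20% threshold, so `flagged` contains both IP addresses of the synthetic data set.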
In a CC attack, the attacker imitates the page-access behavior of real users, so the attack traffic resembles normal access at depth 1. Based on the Bayesian formula, and by training and evaluating on historical data, the invention obtains a page access probability at depth 2 (namely the historical associated access probability), and so avoids judging whether an access is a malicious attack directly from the depth-1 page access probability. Moreover, from the fluctuation between the historical associated access probability and the current associated access probability, both calculated with the Bayesian formula, the invention can determine whether the access habits of the users in the current access data set have changed substantially, and thereby accurately identify whether simulated users carrying out a malicious attack are currently accessing the page.
In this embodiment, when the access traffic state is abnormal, the current associated access probability between a target page and each of a plurality of sub-pages is calculated and compared with the corresponding historical associated access probability; when, for a target sub-page among the plurality of sub-pages, the difference between the current and historical associated access probabilities exceeds a preset numerical threshold, the user IP address in the current access data set is determined to be an illegal IP address. The invention therefore no longer relies on manual analysis, which greatly reduces labor cost. Moreover, because the current and historical associated access probabilities are calculated with the Bayesian formula, the page access habits of users can be analyzed accurately, and whether a network attack is currently under way can be determined from changes in those habits, which ensures the accuracy of network attack identification.
Fig. 2 is a schematic structural diagram of an identification apparatus for network attack according to an embodiment of the present invention, where the identification apparatus 1 for network attack may be applied to a server, and the identification apparatus 1 for network attack may include: the system comprises an acquisition calculation module 10, a comparison module 20, a determination module 30, a detection module 40 and a state determination module 50;
the detection module 40 is configured to detect whether an access traffic per unit time in a current network exceeds a preset traffic threshold;
the state determining module 50 is configured to determine that the access traffic state is an abnormal state if the detection module 40 detects yes;
the state determining module 50 is further configured to determine that the access traffic state is a normal state if the detection module 40 detects no;
Specifically, the detection module 40 may periodically detect whether the access traffic per unit time in the current network exceeds a preset traffic threshold, that is, whether the page access frequency is excessive. If the detection module 40 detects that the access traffic per unit time exceeds the preset traffic threshold, the state determining module 50 may determine that the access traffic state is abnormal; otherwise, the state determining module 50 determines that the access traffic state is normal.
The obtaining and calculating module 10 is configured to obtain a current access data set including current access data associated with a target page when an access traffic state is an abnormal state, and calculate, according to a bayesian formula and the current access data set, current associated access probabilities respectively corresponding to the target page and each of a plurality of sub-pages;
specifically, when the access traffic state is an abnormal state, the obtaining and calculating module 10 may obtain a current access data set including current access data associated with a target page, and calculate, according to a bayesian formula and the current access data set, current associated access probabilities respectively corresponding to the target page and each of a plurality of sub-pages; the current access data set may include current access data corresponding to a plurality of users, respectively, and the current access data of each user includes a user IP address of the corresponding user and access conditions of the user to a plurality of pages. When the current access data of a certain user includes an access condition to the target page (i.e., it indicates that the user has accessed the target page), it may be determined that the current access data of the user is associated with the target page, that is, it indicates that the current access data set includes the current access data associated with the target page, and of course, the current access data set also includes current access data associated with non-target pages (e.g., a plurality of pages accessed by a certain user do not include the target page).
Further, please refer to fig. 3 together, which is a schematic structural diagram of an obtaining calculation module 10 according to an embodiment of the present invention, where the obtaining calculation module 10 may include: an acquisition unit 101, a calculation unit 102;
the acquiring unit 101 is configured to acquire a current access data set including current access data associated with a target page;
the calculating unit 102 is configured to calculate probabilities of accessing corresponding sub-pages according to the current access data set, so as to obtain sub-page access probabilities corresponding to the sub-pages;
Specifically, the calculating unit 102 may calculate the probability of accessing each sub-page from the current access data set, giving the sub-page access probability for each sub-page. The sub-pages are screened out in advance; there may be several of them, and each sub-page has one sub-page access probability. For example, suppose sub-page A, sub-page B, and sub-page C are screened out in advance. When the acquiring unit 101 obtains the current access data set, the calculating unit 102 may count the number of users who accessed sub-page A, the number who accessed sub-page B, and the number who accessed sub-page C. If the current access data set covers 100 users, of whom 10 accessed sub-page A, 15 accessed sub-page B, and 50 accessed sub-page C, the calculating unit 102 may calculate a sub-page access probability of 10% for sub-page A, 15% for sub-page B, and 50% for sub-page C.
The calculating unit 102 is further configured to calculate, according to the current access data set, probabilities that the target page has been accessed under the condition that the corresponding sub-page has been accessed, respectively, so as to obtain conditional access probabilities that the sub-pages respectively correspond to;
Specifically, while calculating the sub-page access probabilities, the calculating unit 102 may also calculate, from the current access data set, the probability that the target page was accessed given that the corresponding sub-page was accessed, giving the conditional access probability for each sub-page. The target page may likewise be screened out in advance. For example, suppose a target page X, sub-page A, sub-page B, and sub-page C are screened out in advance, and the current access data set acquired by the acquiring unit 101 covers 100 users, of whom 8 accessed both sub-page A and the target page X, 15 accessed both sub-page B and the target page X, and 40 accessed both sub-page C and the target page X. The calculating unit 102 may then calculate a conditional access probability of 8% for sub-page A, 15% for sub-page B, and 40% for sub-page C.
The calculating unit 102 is further configured to calculate, according to the bayesian formula, the access probability of each sub-page, and each conditional access probability, a probability of accessing a corresponding sub-page under the condition that the target page has been accessed, so as to obtain current associated access probabilities respectively corresponding to the target page and each sub-page;
specifically, the calculating unit 102 calculates the probability of accessing the corresponding sub-page under the condition of having accessed the target page according to the bayesian formula, the access probability of each sub-page and the access probability of each conditional access, so as to obtain the current associated access probability corresponding to each sub-page between the target page and each sub-page; the Bayesian formula is specifically as follows:
$$P(D_j \mid x) = \frac{P(x \mid D_j)\,P(D_j)}{\sum_{i} P(x \mid D_i)\,P(D_i)}$$
where j is an integer; $P(x \mid D_j)$ is the conditional access probability corresponding to the j-th sub-page; $P(D_j)$ is the sub-page access probability corresponding to the j-th sub-page; $P(x \mid D_i)$ is the conditional access probability corresponding to the i-th sub-page; $P(D_i)$ is the sub-page access probability corresponding to the i-th sub-page; and $P(D_j \mid x)$ is the current associated access probability corresponding to the j-th sub-page.
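The Bayesian step described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `associated_access_probabilities` and the sub-page access probabilities (20%, 30%, 50%) are hypothetical, while the conditional access probabilities reuse the 8%, 15%, and 40% figures from the example above.

```python
def associated_access_probabilities(sub_page_prob, conditional_prob):
    """Return P(D_j | x) for every sub-page j using the Bayesian formula."""
    # Numerator for each sub-page j: P(x | D_j) * P(D_j)
    joint = {
        page: conditional_prob[page] * sub_page_prob[page]
        for page in sub_page_prob
    }
    # Denominator: sum over all sub-pages i of P(x | D_i) * P(D_i)
    total = sum(joint.values())
    if total == 0:
        raise ValueError("the target page was never accessed via any sub-page")
    return {page: value / total for page, value in joint.items()}


# Hypothetical sub-page access probabilities P(D_j) (not from the text).
sub_page_prob = {"A": 0.2, "B": 0.3, "C": 0.5}
# Conditional access probabilities P(x | D_j) from the example above.
conditional_prob = {"A": 0.08, "B": 0.15, "C": 0.40}

current = associated_access_probabilities(sub_page_prob, conditional_prob)
```

Because every value is divided by the same denominator, the resulting associated access probabilities sum to 1 across the sub-pages, which makes the current and historical distributions directly comparable.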
The comparing module 20 is configured to compare the current associated access probabilities respectively corresponding to the target page and the sub-pages with the historical associated access probabilities respectively corresponding to the target page and the sub-pages; the historical associated access probability is calculated according to the Bayesian formula and a historical access data set acquired when the access flow state is a normal state;
Specifically, after the obtaining and calculating module 10 calculates the current associated access probabilities respectively corresponding to the target page and each of the plurality of sub-pages, the comparing module 20 may compare them with the historical associated access probabilities respectively corresponding to the target page and each sub-page. For example, suppose there are a sub-page A, a sub-page B, and a sub-page C, with current associated access probabilities a1, b1, and c1 corresponding to the target page and sub-pages A, B, and C respectively, and historical associated access probabilities a2, b2, and c2 corresponding to the target page and sub-pages A, B, and C respectively. The comparing module 20 may then compare a1 with a2, b1 with b2, and c1 with c2.
The historical associated access probability is calculated according to the Bayesian formula and a historical access data set acquired when the access traffic state is a normal state. Specifically, the obtaining and calculating module 10 is further configured to obtain the historical access data set during a period in which the access traffic state is a normal state, the historical access data set containing historical access data associated with the target page; the obtaining and calculating module 10 is further configured to calculate, according to the Bayesian formula and the historical access data set, the historical associated access probabilities respectively corresponding to the target page and the sub-pages. The obtaining and calculating module 10 may generate the historical associated access probabilities either before step S101 in Fig. 1 or when the access traffic state is an abnormal state. For example, when the access traffic state is an abnormal state, the obtaining and calculating module 10 may look up, in the history records, the historical access data set from a period in which the access traffic state was a normal state, and calculate the historical associated access probabilities according to the Bayesian formula and that data set.
The historical access data set may include historical access data corresponding to a plurality of users, and the historical access data of each user includes the user IP address of that user and the user's access to a plurality of pages. When the historical access data of a user includes access to the target page (that is, the user has accessed the target page), the historical access data of that user is determined to be associated with the target page, which means that the historical access data set contains historical access data associated with the target page. The historical access data set may of course also contain historical access data associated with non-target pages (for example, the pages accessed by a user do not include the target page). The specific process by which the obtaining and calculating module 10 calculates, according to the Bayesian formula and the historical access data set, the historical associated access probabilities respectively corresponding to the target page and the sub-pages may be as follows: calculating, from the historical access data set, the probability of accessing each sub-page, to obtain the historical sub-page access probability corresponding to each sub-page; calculating, from the historical access data set, the probability of accessing the target page given that the corresponding sub-page has been accessed, to obtain the historical conditional access probability corresponding to each sub-page; and calculating, according to the Bayesian formula, the historical sub-page access probabilities, and the historical conditional access probabilities, the probability of accessing the corresponding sub-page given that the target page has been accessed, to obtain the historical associated access probabilities respectively corresponding to the target page and the sub-pages. The sub-pages, the target page, and the Bayesian formula used by the obtaining and calculating module 10 in calculating the historical associated access probabilities are the same as those it uses in calculating the current associated access probabilities.
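As an illustration of how the sub-page access probabilities and conditional access probabilities might be estimated from such a data set, the following Python sketch uses relative frequencies. The record layout (a mapping from user IP address to the set of pages accessed), the function name `estimate_probabilities`, the sample IP addresses, and the choice to estimate the conditional probability as the share of sub-page visitors who also accessed the target page are all assumptions made for this sketch; the text does not prescribe a particular estimator.

```python
from collections import Counter


def estimate_probabilities(access_data, target_page, sub_pages):
    """Estimate P(D_j) and P(x | D_j) from per-user access records.

    access_data maps a user IP address to the set of pages that user accessed.
    """
    total_users = len(access_data)
    if total_users == 0:
        raise ValueError("the access data set is empty")
    visited = Counter()       # users who accessed sub-page D_j
    visited_both = Counter()  # users who accessed D_j and the target page
    for pages in access_data.values():
        for sub_page in sub_pages:
            if sub_page in pages:
                visited[sub_page] += 1
                if target_page in pages:
                    visited_both[sub_page] += 1
    # P(D_j): share of all users who accessed the sub-page
    sub_page_prob = {p: visited[p] / total_users for p in sub_pages}
    # P(x | D_j): share of the sub-page's visitors who also accessed the target
    conditional_prob = {
        p: (visited_both[p] / visited[p]) if visited[p] else 0.0
        for p in sub_pages
    }
    return sub_page_prob, conditional_prob


# Hypothetical access records keyed by user IP address.
access_data = {
    "10.0.0.1": {"A", "X"},
    "10.0.0.2": {"A"},
    "10.0.0.3": {"B", "X"},
    "10.0.0.4": {"B", "C", "X"},
    "10.0.0.5": {"C"},
}
sub_page_prob, conditional_prob = estimate_probabilities(
    access_data, target_page="X", sub_pages=["A", "B", "C"]
)
```

The same function can be applied unchanged to the historical and the current access data sets, which keeps the two sets of probabilities comparable.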
The determining module 30 is configured to determine that the user IP address in the current access data set is an illegal IP address when a difference between the current associated access probability corresponding to the target sub-page and the historical associated access probability in the plurality of sub-pages exceeds a preset numerical threshold.
Specifically, when the difference between the current associated access probability and the historical associated access probability corresponding to a target sub-page among the plurality of sub-pages exceeds a preset numerical threshold, the determining module 30 may determine that the user IP address in the current access data set is an illegal IP address. For example, if the historical associated access probability corresponding to a certain sub-page is 35%, the current associated access probability corresponding to that sub-page is 85%, and the preset numerical threshold is 20%, the difference (50%) between the current and historical associated access probabilities exceeds the preset numerical threshold; the determining module 30 may therefore determine that the user IP address in the current access data set is an illegal IP address, so that protective measures against the CC attack can be applied to the current access data set.
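The threshold decision can be sketched as follows. The function name `flag_abnormal_sub_pages` and the values for sub-pages B and C are hypothetical; sub-page A reuses the 35% and 85% figures and the 20% threshold from the example above. Using the absolute difference is an assumption of this sketch, since the text speaks only of "the difference".

```python
def flag_abnormal_sub_pages(current, historical, threshold):
    """Return sub-pages whose associated access probability moved by more than threshold."""
    return [
        page
        for page in current
        if abs(current[page] - historical[page]) > threshold
    ]


# Sub-page A follows the example in the text; B and C are hypothetical.
historical = {"A": 0.35, "B": 0.40, "C": 0.25}
current = {"A": 0.85, "B": 0.05, "C": 0.10}

flagged = flag_abnormal_sub_pages(current, historical, threshold=0.20)
# If any sub-page is flagged, the user IP addresses in the current
# access data set would be treated as illegal IP addresses.
attack_suspected = bool(flagged)
```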
Because a CC attack is one in which the attacker imitates a real user's page-access behavior, which is equivalent to normal access at depth 1, the invention uses the Bayesian formula, trained and evaluated on historical data, to obtain page access probabilities at depth 2 (namely, the historical associated access probabilities), and thereby avoids judging whether an access is a malicious attack directly from depth-1 page access probabilities. Moreover, based on the fluctuation between the historical and current associated access probabilities calculated by the Bayesian formula, the invention can determine whether the access habits of the users corresponding to the current access data set have changed substantially, and thus accurately identify whether simulated users carrying out a malicious attack are currently accessing the pages.
In this embodiment, when the access traffic state is an abnormal state, the current associated access probabilities respectively corresponding to a target page and each of a plurality of sub-pages are calculated and compared with the historical associated access probabilities respectively corresponding to the target page and each sub-page, and when the difference between the current associated access probability and the historical associated access probability corresponding to a target sub-page among the plurality of sub-pages exceeds a preset numerical threshold, the user IP address in the current access data set is determined to be an illegal IP address. The invention therefore no longer relies on manual effort, which greatly reduces labor costs. In addition, because the current and historical associated access probabilities are calculated based on the Bayesian formula, the page access habits of users can be analyzed accurately, and whether a network attack is currently under way can be determined from changes in those habits, which ensures the accuracy of network attack identification.
Fig. 4 is a schematic structural diagram of another network attack recognition apparatus according to an embodiment of the present invention. The network attack recognition apparatus 1000 may include at least one processor 1001 (such as a CPU), at least one network interface 1004, a user interface 1003, a memory 1005, and at least one communication bus 1002. The communication bus 1002 is used to enable connection and communication between these components. The user interface 1003 may include a display and a keyboard, and may optionally also include a standard wired interface and a standard wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wi-Fi interface). The memory 1005 may be a high-speed RAM or a non-volatile memory, such as at least one disk memory. The memory 1005 may optionally also be at least one storage device located remotely from the processor 1001. As shown in Fig. 4, the memory 1005, as a computer storage medium, may include an operating system, a network communication module, a user interface module, and a device control application program.
In the network attack recognition apparatus 1000 shown in Fig. 4, the user interface 1003 is mainly used to provide an input interface for a user and to obtain data entered by the user; the processor 1001 may be configured to invoke the device control application program stored in the memory 1005 and specifically perform the following steps:
when the access flow state is an abnormal state, acquiring a current access data set containing current access data associated with a target page, and calculating current associated access probabilities respectively corresponding to the target page and each sub-page in a plurality of sub-pages according to a Bayesian formula and the current access data set;
comparing the current associated access probability respectively corresponding to the target page and each sub-page with the historical associated access probability respectively corresponding to the target page and each sub-page; the historical associated access probability is calculated according to the Bayesian formula and a historical access data set acquired when the access flow state is a normal state;
and when the difference value between the current associated access probability corresponding to the target sub-page and the historical associated access probability in the plurality of sub-pages exceeds a preset numerical threshold, determining that the user IP address in the current access data set is an illegal IP address.
In one embodiment, the processor 1001 further performs the steps of:
acquiring the historical access data set in the period that the access flow state is a normal state; the historical access data set contains historical access data associated with the target page;
and calculating, according to the Bayesian formula and the historical access data set, the historical associated access probabilities respectively corresponding to the target page and each sub-page.
In an embodiment, when the processor 1001 obtains a current access data set including current access data associated with a target page, and calculates current associated access probabilities respectively corresponding to the target page and each of a plurality of sub-pages according to a bayesian formula and the current access data set, the following steps are specifically performed:
acquiring a current access data set containing current access data associated with a target page;
respectively calculating the probability of accessing the corresponding sub-pages according to the current access data set so as to obtain the sub-page access probability corresponding to each sub-page;
respectively calculating the probability of accessing the target page under the condition of accessing the corresponding sub-page according to the current access data set so as to obtain the conditional access probability corresponding to each sub-page;
and calculating, according to the Bayesian formula, each sub-page access probability, and each conditional access probability, the probability of accessing the corresponding sub-page given that the target page has been accessed, so as to obtain the current associated access probabilities respectively corresponding to the target page and each sub-page.
In one embodiment, the bayesian formula is specifically:
$$P(D_j \mid x) = \frac{P(x \mid D_j)\,P(D_j)}{\sum_{i} P(x \mid D_i)\,P(D_i)}$$
where j is an integer; $P(x \mid D_j)$ is the conditional access probability corresponding to the j-th sub-page; $P(D_j)$ is the sub-page access probability corresponding to the j-th sub-page; $P(x \mid D_i)$ is the conditional access probability corresponding to the i-th sub-page; $P(D_i)$ is the sub-page access probability corresponding to the i-th sub-page; and $P(D_j \mid x)$ is the current associated access probability corresponding to the j-th sub-page.
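For readers who want the intermediate step, the formula follows from the definition of conditional probability together with the law of total probability. The derivation below assumes, as the standard form of Bayes' theorem does, that the sub-pages $D_i$ are treated as mutually exclusive and jointly exhaustive routes to the target page x; the text does not state this assumption explicitly.

```latex
\begin{aligned}
P(D_j \mid x) &= \frac{P(x, D_j)}{P(x)}
  && \text{(definition of conditional probability)} \\
P(x, D_j) &= P(x \mid D_j)\,P(D_j)
  && \text{(product rule)} \\
P(x) &= \sum_{i} P(x \mid D_i)\,P(D_i)
  && \text{(law of total probability)} \\
\Rightarrow\; P(D_j \mid x) &= \frac{P(x \mid D_j)\,P(D_j)}{\sum_{i} P(x \mid D_i)\,P(D_i)}
\end{aligned}
```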
In an embodiment, before the processor 1001 obtains a current access data set including current access data associated with a target page when an access traffic state is an abnormal state, and calculates current associated access probabilities respectively corresponding to the target page and each of a plurality of sub-pages according to a bayesian formula and the current access data set, the following steps are further performed:
detecting whether the access flow in a unit time in the current network exceeds a preset flow threshold value;
if the detection result is yes, determining that the access flow state is an abnormal state;
if the detection is no, determining that the access flow state is a normal state.
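The traffic-state check in the steps above can be sketched as a simple rate comparison. The function name `access_traffic_state`, the window length, and the threshold of 100 requests per second are hypothetical values chosen for illustration; the text specifies only that the access traffic per unit time is compared with a preset traffic threshold.

```python
def access_traffic_state(request_count, window_seconds, traffic_threshold):
    """Classify the traffic state from the number of requests per second."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    rate = request_count / window_seconds
    # Exceeding the preset threshold marks the state as abnormal.
    return "abnormal" if rate > traffic_threshold else "normal"


# Hypothetical measurements over a 60-second window.
busy = access_traffic_state(12000, 60, traffic_threshold=100)   # 200 requests/s
quiet = access_traffic_state(3000, 60, traffic_threshold=100)   # 50 requests/s
```

Only when the state is abnormal does the Bayesian comparison need to run on the current access data set; data collected while the state is normal can serve as the historical baseline.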
In this embodiment, when the access traffic state is an abnormal state, the current associated access probabilities respectively corresponding to a target page and each of a plurality of sub-pages are calculated and compared with the historical associated access probabilities respectively corresponding to the target page and each sub-page, and when the difference between the current associated access probability and the historical associated access probability corresponding to a target sub-page among the plurality of sub-pages exceeds a preset numerical threshold, the user IP address in the current access data set is determined to be an illegal IP address. The invention therefore no longer relies on manual effort, which greatly reduces labor costs. In addition, because the current and historical associated access probabilities are calculated based on the Bayesian formula, the page access habits of users can be analyzed accurately, and whether a network attack is currently under way can be determined from changes in those habits, which ensures the accuracy of network attack identification.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
The above disclosure describes only preferred embodiments of the present invention and is not intended to limit the scope of the claims of the present invention.

Claims (8)

1. A network attack recognition method is characterized by comprising the following steps:
when the access flow state is an abnormal state, acquiring a current access data set containing current access data associated with a target page, and respectively calculating the probability of accessing corresponding sub-pages according to the current access data set to obtain the sub-page access probability respectively corresponding to each sub-page;
respectively calculating the probability of accessing the target page under the condition of accessing the corresponding sub-page according to the current access data set so as to obtain the conditional access probability corresponding to each sub-page;
according to the Bayesian formula, the access probability of each sub-page and the conditional access probability, respectively calculating the probability of accessing the corresponding sub-page under the condition of accessing the target page, so as to obtain the current associated access probability respectively corresponding to the target page and each sub-page;
comparing the current associated access probability respectively corresponding to the target page and each sub-page with the historical associated access probability respectively corresponding to the target page and each sub-page; the historical associated access probability is calculated according to the Bayesian formula and a historical access data set acquired when the access flow state is a normal state;
and when the difference value between the current associated access probability corresponding to the target sub-page and the historical associated access probability in the plurality of sub-pages exceeds a preset numerical threshold, determining that the user Internet Protocol (IP) address in the current access data set is an illegal IP address.
2. The method of claim 1, further comprising:
acquiring the historical access data set in the period that the access flow state is a normal state; the historical access data set contains historical access data associated with the target page;
and calculating, according to the Bayesian formula and the historical access data set, the historical associated access probabilities respectively corresponding to the target page and each sub-page.
3. The method of claim 1, wherein the bayesian formula is specifically:
$$P(D_j \mid x) = \frac{P(x \mid D_j)\,P(D_j)}{\sum_{i} P(x \mid D_i)\,P(D_i)}$$
wherein j is an integer; $P(x \mid D_j)$ is the conditional access probability corresponding to the j-th sub-page; $P(D_j)$ is the sub-page access probability corresponding to the j-th sub-page; $P(x \mid D_i)$ is the conditional access probability corresponding to the i-th sub-page; $P(D_i)$ is the sub-page access probability corresponding to the i-th sub-page; and $P(D_j \mid x)$ is the current associated access probability corresponding to the j-th sub-page.
4. The method according to claim 1, wherein before obtaining a current access data set including current access data associated with a target page and calculating current associated access probabilities respectively corresponding to the target page and each of a plurality of sub-pages according to a bayesian formula and the current access data set when the access traffic status is an abnormal status, the method further comprises:
detecting whether the access flow in a unit time in the current network exceeds a preset flow threshold value;
if the detection result is yes, determining that the access flow state is an abnormal state;
if the detection is no, determining that the access flow state is a normal state.
5. An apparatus for identifying a cyber attack, comprising:
the acquisition and calculation module is used for acquiring a current access data set containing current access data associated with a target page when the access flow state is an abnormal state, and calculating current associated access probabilities respectively corresponding to the target page and each sub-page in the sub-pages according to a Bayesian formula and the current access data set; the acquisition calculation module comprises an acquisition unit and a calculation unit, wherein:
the acquisition unit is used for acquiring a current access data set containing current access data associated with a target page;
the calculating unit is used for respectively calculating the probability of accessing the corresponding sub-pages according to the current access data set so as to obtain the sub-page access probability corresponding to each sub-page;
the computing unit is further configured to respectively compute probabilities that the target page is visited under the condition that the corresponding sub-page has been visited according to the current access data set, so as to obtain conditional access probabilities respectively corresponding to the sub-pages;
the calculating unit is further configured to calculate, according to the bayesian formula, the access probability of each sub-page, and each conditional access probability, a probability of accessing the corresponding sub-page under the condition that the target page has been accessed, so as to obtain current associated access probabilities respectively corresponding to the target page and each sub-page;
a comparison module, configured to compare current associated access probabilities respectively corresponding to the target page and the sub-pages with historical associated access probabilities respectively corresponding to the target page and the sub-pages; the historical associated access probability is calculated according to the Bayesian formula and a historical access data set acquired when the access flow state is a normal state;
a determining module, configured to determine that a user IP address in the current access data set is an illegal IP address when a difference between the current associated access probability corresponding to a target sub-page and the historical associated access probability in the plurality of sub-pages exceeds a preset numerical threshold.
6. The apparatus of claim 5,
the acquisition calculation module is further configured to acquire the historical access data set during a period in which the access traffic state is a normal state; the historical access data set contains historical access data associated with the target page;
the obtaining and calculating module is further configured to calculate historical associated access probabilities respectively corresponding to the target page and the sub-pages according to the bayesian formula and the historical access data set.
7. The apparatus of claim 5, wherein the Bayesian formula is specifically:
$$P(D_j \mid x) = \frac{P(x \mid D_j)\,P(D_j)}{\sum_{i} P(x \mid D_i)\,P(D_i)}$$
wherein j is an integer; $P(x \mid D_j)$ is the conditional access probability corresponding to the j-th sub-page; $P(D_j)$ is the sub-page access probability corresponding to the j-th sub-page; $P(x \mid D_i)$ is the conditional access probability corresponding to the i-th sub-page; $P(D_i)$ is the sub-page access probability corresponding to the i-th sub-page; and $P(D_j \mid x)$ is the current associated access probability corresponding to the j-th sub-page.
8. The apparatus of claim 5, further comprising:
the detection module is used for detecting whether the access flow in the current network per unit time exceeds a preset flow threshold value;
the state determining module is used for determining that the access flow state is an abnormal state if the detection module detects yes;
the state determining module is further configured to determine that the access traffic state is a normal state if the detection module detects no.
CN201610345863.1A 2016-05-23 2016-05-23 Network attack identification method and device Active CN107426136B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610345863.1A CN107426136B (en) 2016-05-23 2016-05-23 Network attack identification method and device

Publications (2)

Publication Number Publication Date
CN107426136A CN107426136A (en) 2017-12-01
CN107426136B true CN107426136B (en) 2020-01-14

Family

ID=60422433

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610345863.1A Active CN107426136B (en) 2016-05-23 2016-05-23 Network attack identification method and device

Country Status (1)

Country Link
CN (1) CN107426136B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109167773B (en) * 2018-08-22 2021-01-26 杭州安恒信息技术股份有限公司 Access anomaly detection method and system based on Markov model
CN110505232A (en) * 2019-08-27 2019-11-26 百度在线网络技术(北京)有限公司 The detection method and device of network attack, electronic equipment, storage medium
CN111190926B (en) * 2019-11-25 2023-04-07 腾讯云计算(北京)有限责任公司 Resource caching method, device, equipment and storage medium
CN111079138A (en) * 2019-12-19 2020-04-28 北京天融信网络安全技术有限公司 Abnormal access detection method and device, electronic equipment and readable storage medium
CN111669379B (en) * 2020-05-28 2022-02-22 北京天空卫士网络安全技术有限公司 Behavior abnormity detection method and device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101150586A (en) * 2007-11-20 2008-03-26 杭州华三通信技术有限公司 CC attack prevention method and device
CN104935609A (en) * 2015-07-17 2015-09-23 北京京东尚科信息技术有限公司 Network attack detection method and detection apparatus
CN104967629A (en) * 2015-07-16 2015-10-07 网宿科技股份有限公司 Network attack detection method and apparatus
CN105429936A (en) * 2015-10-21 2016-03-23 北京交通大学 Defense method and apparatus of malicious occupation of storage resources in private network router

Also Published As

Publication number Publication date
CN107426136A (en) 2017-12-01

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20231227

Address after: 518057 Tencent Building, No. 1 High-tech Zone, Nanshan District, Shenzhen City, Guangdong Province, 35 floors

Patentee after: TENCENT TECHNOLOGY (SHENZHEN) Co.,Ltd.

Patentee after: TENCENT CLOUD COMPUTING (BEIJING) Co.,Ltd.

Address before: 2, 518000, East 403 room, SEG science and Technology Park, Zhenxing Road, Shenzhen, Guangdong, Futian District

Patentee before: TENCENT TECHNOLOGY (SHENZHEN) Co.,Ltd.