CN111818011A - Abnormal access behavior recognition method and device, computer equipment and storage medium - Google Patents

Abnormal access behavior recognition method and device, computer equipment and storage medium

Info

Publication number
CN111818011A
CN111818011A CN202010479008.6A CN202010479008A CN111818011A CN 111818011 A CN111818011 A CN 111818011A CN 202010479008 A CN202010479008 A CN 202010479008A CN 111818011 A CN111818011 A CN 111818011A
Authority
CN
China
Prior art keywords
url
log data
access behavior
abnormal access
behavior
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010479008.6A
Other languages
Chinese (zh)
Inventor
罗振珊
唐炳武
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Property and Casualty Insurance Company of China Ltd
Original Assignee
Ping An Property and Casualty Insurance Company of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Property and Casualty Insurance Company of China Ltd
Priority to CN202010479008.6A
Publication of CN111818011A
Legal status: Pending

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/14Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425Traffic logging, e.g. anomaly detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/955Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
    • G06F16/958Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Security & Cryptography (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Computer Hardware Design (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The application discloses an abnormal access behavior identification method and device, computer equipment and a storage medium. The method comprises the following steps: acquiring log data; calculating, according to the log data, all time intervals corresponding to two adjacent accesses of the same specified URL by the specified user, wherein the specified URL is any one of all URLs contained in the log data; calling a preset calculation formula to calculate a coefficient of variation of the time intervals; and determining, according to the coefficient of variation and a preset variation threshold, whether the behavior of the specified user accessing the specified URL is an abnormal access behavior. Time interval data is calculated from the acquired log data, dispersion analysis is performed on that data to obtain the coefficient of variation, and the coefficient of variation is compared with the preset variation threshold to determine whether the specified user's access to the specified URL is abnormal, which improves the identification of abnormal access behaviors that access a URL at regular intervals.

Description

Abnormal access behavior recognition method and device, computer equipment and storage medium
Technical Field
The present application relates to the field of mobile communications technologies, and in particular, to a method and an apparatus for identifying an abnormal access behavior, a computer device, and a storage medium.
Background
With the continuous development of internet technology and applications, WEB applications have become an indispensable part of modern production and daily life, and they have also become a main target of attacks on the internet. A WEB log is audit information about WEB access behaviors recorded by the WEB server. The access behavior of website users can be learned from WEB log data, but that data also contains abnormal access behaviors that may be attacks. Currently, common abnormal access behaviors such as plug-ins and crawlers are usually identified by counting request frequency for a single IP, User Agent, …, and the like, setting thresholds, and blocking requests that exceed them. However, these identification methods cannot detect abnormal access behaviors that simulate manual operation, that is, behaviors that follow a person's operation sequence and access URLs (uniform resource locators) in order at fixed time intervals, so their level of intelligence is not high.
Disclosure of Invention
The main purpose of the application is to provide an abnormal access behavior identification method and device, computer equipment and a storage medium, so as to solve the technical problem that existing methods for identifying abnormal access behaviors in log data cannot detect abnormal access behaviors that simulate manual operation and are therefore not very intelligent.
The application provides an abnormal access behavior identification method, which comprises the following steps:
acquiring log data;
calculating all time intervals corresponding to two adjacent accesses of the same specified URL by the specified user according to the log data, wherein the specified URL is any one of all URLs contained in the log data;
calling a preset calculation formula to calculate a variation coefficient related to the time interval according to the time interval;
and determining whether the behavior of the specified user for accessing the specified URL is an abnormal access behavior or not according to the variation coefficient and a preset variation threshold.
Optionally, the step of calculating all time intervals corresponding to two adjacent visits by the specified user to the same specified URL according to the log data includes:
carrying out data cleaning on the log data, and filtering redundant data to obtain filtered log data;
carrying out format normalization processing on the filtered log data to obtain log data in a standard format;
according to preset data screening conditions, screening designated log data meeting the data screening conditions from the log data in the standard format;
and calculating all time intervals corresponding to two adjacent accesses of the same specified URL by the specified user according to the designated log data.
Optionally, the step of calling a preset calculation formula to calculate a coefficient of variation associated with the time interval according to the time interval includes:
calculating the average value μ of all the time intervals; and
calculating the standard deviation σ of all the time intervals;
calculating the coefficient of variation cv by calling the calculation formula cv = σ/μ.
Optionally, before the step of calculating the average μ of all the time intervals, the method includes:
acquiring the quantity values of all the time intervals;
judging whether the quantity value is larger than a preset quantity threshold value or not;
and if the quantity value is judged to be larger than a preset quantity threshold value, generating a calculation instruction for calculating the average value mu of all the time intervals.
Optionally, the step of determining whether a behavior of the specified user accessing the specified URL is an abnormal access behavior according to the variation coefficient and a preset variation threshold includes:
judging whether the variation coefficient is smaller than the variation threshold value;
if the variation coefficient is smaller than the variation threshold, judging that the behavior of the specified user for accessing the specified URL is an abnormal access behavior;
and if the variation coefficient is judged to be not smaller than the variation threshold, judging that the behavior of the specified user for accessing the specified URL is not abnormal access behavior.
Optionally, after the step of determining whether the behavior of the specified user accessing the specified URL is an abnormal access behavior according to the variation coefficient and a preset variation threshold, the method includes:
after determining whether the access behaviors corresponding to all URLs in the log data are abnormal access behaviors or not, screening out specific URLs corresponding to the abnormal access behaviors, wherein the number of the specific URLs is one or more;
acquiring specific access behavior information corresponding to the specific URL;
and displaying the specific access behavior information.
Optionally, after the step of screening out a specific URL corresponding to an abnormal access behavior after determining whether behaviors of accessing all URLs in the log data are abnormal access behaviors respectively, the method includes:
when an access event that a specific user accesses the specific URL is detected again, generating alarm information corresponding to the access event;
limiting the execution of response operation on the specific access behavior corresponding to the access event;
extracting specific user information from specific access behavior information corresponding to the specific URL;
and sending reminding information for modifying the personal information to a specific user terminal corresponding to the specific user information.
The present application further provides an abnormal access behavior recognition apparatus, including:
the first acquisition module is used for acquiring log data;
the first calculation module is used for calculating all time intervals corresponding to two adjacent accesses of the same specified URL by the specified user according to the log data, wherein the specified URL is any one of all URLs contained in the log data;
the second calculation module is used for calling a preset calculation formula to calculate a variation coefficient related to the time interval according to the time interval;
and the determining module is used for determining whether the behavior of the specified user for accessing the specified URL is an abnormal access behavior according to the variation coefficient and a preset variation threshold.
The present application further provides a computer device, comprising a memory and a processor, wherein the memory stores a computer program, and the processor implements the steps of the above method when executing the computer program.
The present application also provides a computer-readable storage medium having stored thereon a computer program which, when being executed by a processor, carries out the steps of the above-mentioned method.
The abnormal access behavior identification method, the abnormal access behavior identification device, the computer equipment and the storage medium have the following beneficial effects:
The abnormal access behavior identification method and device, the computer equipment and the storage medium provided by the application acquire log data; calculate all time intervals corresponding to two adjacent accesses of the same specified URL by the specified user according to the log data, wherein the specified URL is any one of all URLs contained in the log data; call a preset calculation formula to calculate a coefficient of variation of the time intervals; and determine whether the behavior of the specified user accessing the specified URL is an abnormal access behavior according to the coefficient of variation and a preset variation threshold. Time interval data is calculated from the acquired log data, the relevant calculation formula is called to perform dispersion analysis on the time interval data and obtain the coefficient of variation, and the coefficient of variation is then compared with the preset variation threshold to determine whether the specified user's access to the specified URL is abnormal. In this way, abnormal access behaviors that access a URL at regular intervals can be identified accurately and quickly in the log data, which improves the intelligence of identifying such behaviors.
Drawings
Fig. 1 is a schematic flowchart of an abnormal access behavior identification method according to an embodiment of the present application;
Fig. 2 is a schematic structural diagram of an abnormal access behavior recognition apparatus according to an embodiment of the present application;
Fig. 3 is a schematic structural diagram of a computer device according to an embodiment of the present application.
The implementation, functional features and advantages of the objectives of the present application will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
It should be noted that all directional indicators (such as upper, lower, left, right, front and rear … …) in the embodiments of the present application are only used to explain the relative position relationship between the components, the movement situation, etc. in a specific posture (as shown in the drawings), and if the specific posture is changed, the directional indicator is changed accordingly, and the connection may be a direct connection or an indirect connection.
Referring to fig. 1, an abnormal access behavior identification method according to an embodiment of the present application includes:
S1: acquiring log data;
S2: calculating all time intervals corresponding to two adjacent accesses of the same specified URL by the specified user according to the log data, wherein the specified URL is any one of all URLs contained in the log data;
S3: calling a preset calculation formula to calculate a variation coefficient related to the time interval according to the time interval;
S4: determining whether the behavior of the specified user for accessing the specified URL is an abnormal access behavior or not according to the variation coefficient and a preset variation threshold.
As described in the above steps S1 to S4, the execution subject of this method embodiment is an abnormal access behavior recognition device. In practical applications, the abnormal access behavior recognition device may be implemented as a virtual device, such as software code, or as a physical device in which the relevant execution code is written or integrated, and may interact with a user through a keyboard, a mouse, a remote controller, a touch panel, or a voice control device. The log-data-based abnormal access behavior recognition device provided by this embodiment can intelligently and accurately recognize abnormal access behaviors present in log data. Specifically, log data is first acquired; WEB log data related to URL access behavior can be acquired from a big data platform. After the log data is obtained, all time intervals corresponding to two adjacent accesses of the same specified URL by the specified user are calculated according to the log data, wherein the specified URL is any one of all URLs contained in the log data. The acquired log data may first undergo data cleaning, filtering, format normalization and screening to obtain designated log data, and the time intervals are then calculated from the designated log data. Here, a time interval is the difference between the access times of two adjacent accesses of the same specified URL by the specified user; the access time may be in a year, month, day, hour, minute and second format, and the time interval is in units of seconds. For example, if the access times of two consecutive accesses of a specified URL by a specified user are 2020/3/3/12:24:30 and 2020/3/3/12:24:52, the corresponding time interval is 22 seconds.
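The interval calculation described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the record layout `(user_id, url, timestamp)`, the function name `access_intervals` and the sample path `/quote` are assumptions introduced here; only the timestamp format and the 22-second result come from the example in the text.

```python
from collections import defaultdict
from datetime import datetime

# Timestamp layout taken from the example in the text (2020/3/3/12:24:30).
TIME_FORMAT = "%Y/%m/%d/%H:%M:%S"


def access_intervals(records):
    """Return {(user_id, url): [gap_in_seconds, ...]} for adjacent accesses.

    `records` is an iterable of (user_id, url, timestamp_string) tuples;
    this layout is hypothetical and chosen only for illustration.
    """
    times = defaultdict(list)
    for user_id, url, stamp in records:
        times[(user_id, url)].append(datetime.strptime(stamp, TIME_FORMAT))

    intervals = {}
    for key, stamps in times.items():
        stamps.sort()  # adjacency is defined on time-ordered accesses
        intervals[key] = [
            (later - earlier).total_seconds()
            for earlier, later in zip(stamps, stamps[1:])
        ]
    return intervals


records = [
    ("u1", "/quote", "2020/3/3/12:24:30"),
    ("u1", "/quote", "2020/3/3/12:24:52"),
    ("u1", "/quote", "2020/3/3/12:25:14"),
]
print(access_intervals(records))  # {('u1', '/quote'): [22.0, 22.0]}
```

Grouping by the pair (user, URL) before sorting keeps accesses by other users, or to other URLs, from being treated as adjacent.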
After the time intervals are obtained, a preset calculation formula is called to calculate the coefficient of variation of the time intervals. Specifically, the calculation formula may be cv = σ/μ, where cv is the coefficient of variation, σ is the standard deviation of all the time intervals, and μ is the average value of all the time intervals. Finally, once the coefficient of variation is obtained, whether the behavior of the specified user accessing the specified URL is an abnormal access behavior is determined according to the coefficient of variation and a preset variation threshold. The way the variation threshold is generated is not particularly limited; for example, the device may generate it from historical data processing records, or a user may determine it according to personal needs and input it into the device. The calculated coefficient of variation is compared with the preset variation threshold to obtain a comparison result, and whether the behavior of the specified user accessing the specified URL is an abnormal access behavior is judged from that result. In this embodiment, time interval data is calculated from the acquired log data, the relevant calculation formula is called to perform dispersion analysis on the time interval data and obtain the coefficient of variation, and the coefficient of variation is compared with the preset variation threshold, so that abnormal access behaviors that access a URL at regular intervals can be identified accurately and quickly in the log data, which improves the intelligence of identifying such behaviors.
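Written out in full, the quantities in the formula can be stated as below, where t_1, …, t_n denote the n time intervals in seconds (the symbols t_i and n are introduced here for illustration). The source does not say whether σ is the population or the sample standard deviation; the population form is assumed in this sketch.

```latex
\mu = \frac{1}{n}\sum_{i=1}^{n} t_i, \qquad
\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(t_i - \mu\right)^2}, \qquad
cv = \frac{\sigma}{\mu}
```

Because σ and μ are both measured in seconds, cv is dimensionless, which is what allows a single threshold to be applied to URLs that are accessed at very different rates.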
Further, in an embodiment of the present application, the step S2 includes:
S200: carrying out data cleaning on the log data, and filtering redundant data to obtain filtered log data;
S201: carrying out format normalization processing on the filtered log data to obtain log data in a standard format;
S202: according to preset data screening conditions, screening designated log data meeting the data screening conditions from the log data in the standard format;
S203: calculating all time intervals corresponding to two adjacent accesses of the same specified URL by the specified user according to the designated log data.
As described in the foregoing steps S200 to S203, the step of calculating all time intervals corresponding to two adjacent accesses of the same designated URL by the designated user according to the log data may specifically include: firstly, data cleaning is carried out on the log data, redundant data are filtered, and the filtered log data are obtained. The original log data is cleaned, so that useless redundant data are filtered, the subsequent data processing amount is reduced, and the data processing speed is increased. And when the filtered log data is obtained, performing format normalization processing on the filtered log data to obtain the log data in a standard format. For the log data acquired from the big data platform, because the formats of the log data acquired from different data sources may be different, the format of the log data needs to be standardized, so that the format of the log data is changed into a standard format, and therefore in the subsequent process of screening out the specified log data, only data screening conditions need to be set for the data in one standard format, and data screening conditions do not need to be established for the data in multiple formats, so that the subsequent data processing amount is effectively reduced. And then screening the designated log data meeting the data screening conditions from the log data in the standard format according to preset data screening conditions. The data filtering condition is a condition related to data required for calculating the time interval, and the designated log data meeting the data filtering condition may specifically include data such as a user ID, an operation date, an operation time, an accessed URL, and a client IP. And finally, according to the designated log data, calculating all time intervals corresponding to two adjacent times of the designated user accessing the same designated URL. 
The time interval refers to the time difference between the access times of two adjacent accesses of the same specified URL by the specified user; the access times can be in a year, month, day, hour, minute and second format, and the time interval is in seconds. In this embodiment, the designated log data is obtained by performing data cleaning, filtering, format normalization and screening on the acquired log data, and the time intervals are then calculated from the designated log data, so that the coefficient of variation can subsequently be calculated conveniently and quickly by applying the preset calculation formula to the time intervals.
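Steps S200 to S203 can be sketched as three small stages ahead of the interval calculation. The field names, the two source timestamp formats and the function names below are hypothetical, chosen only to show how cleaning, normalization and screening might be separated; the source does not specify them.

```python
from datetime import datetime

# Hypothetical field names; the text lists user ID, operation date and time,
# accessed URL and client IP as the data needed for the interval calculation.
REQUIRED_FIELDS = ("user_id", "timestamp", "url", "client_ip")
# Hypothetical timestamp layouts used by two different data sources.
SOURCE_FORMATS = ("%Y/%m/%d %H:%M:%S", "%Y-%m-%dT%H:%M:%S")


def clean(rows):
    """S200: drop rows that lack a value needed later (treated as redundant)."""
    return [r for r in rows if all(r.get(f) for f in REQUIRED_FIELDS)]


def parse_time(raw):
    """Try each known source format; return None if none matches."""
    for fmt in SOURCE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt)
        except ValueError:
            continue
    return None


def normalize(rows):
    """S201: convert every row to one standard representation."""
    out = []
    for r in rows:
        when = parse_time(r["timestamp"])
        if when is not None:
            out.append({**r, "url": r["url"].strip(), "timestamp": when})
    return out


def screen(rows):
    """S202: keep only the fields required for the interval calculation."""
    return [{f: r[f] for f in REQUIRED_FIELDS} for r in rows]


raw_rows = [
    {"user_id": "u1", "timestamp": "2020/3/3 12:24:30", "url": " /quote ",
     "client_ip": "10.0.0.1", "referer": "x"},
    {"user_id": "u1", "timestamp": "2020-03-03T12:24:52", "url": "/quote",
     "client_ip": "10.0.0.1"},
    {"user_id": "", "timestamp": "2020/3/3 12:25:00", "url": "/quote",
     "client_ip": "10.0.0.1"},
]
designated = screen(normalize(clean(raw_rows)))
```

Normalizing before screening means the screening condition only has to be written once, against the standard format, which is the benefit the text attributes to step S201.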
Further, in an embodiment of the present application, the step S3 includes:
S300: calculating the average value μ of all the time intervals; and
S301: calculating the standard deviation σ of all the time intervals;
S302: calculating the coefficient of variation cv by calling the calculation formula cv = σ/μ.
As described in the foregoing steps S300 to S302, the step of calling a preset calculation formula to calculate the coefficient of variation of the time intervals may specifically include the following. First, the average value μ of all the time intervals is calculated. The standard deviation σ of all the time intervals can be calculated at the same time as the average value μ. After the average value μ and the standard deviation σ are obtained, the calculation formula cv = σ/μ is called to calculate the coefficient of variation cv. Like the range, the standard deviation and the variance, the coefficient of variation reflects the degree of dispersion of the data. The range, the standard deviation and the variance, however, are absolute measures whose magnitude is affected not only by the dispersion of the variable values but also by their average level, whereas the coefficient of variation is a relative measure. Using the coefficient of variation therefore eliminates the influence of measurement scale and dimension and allows data to be compared objectively. In this embodiment, a calculation formula from dispersion analysis is used to quickly calculate the coefficient of variation of the time intervals, which makes it possible to determine quickly, from the calculated coefficient of variation cv and the preset variation threshold, whether the behavior of the specified user accessing the specified URL is an abnormal access behavior.
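Steps S300 to S302 can be sketched with Python's standard `statistics` module. The function name and the sample interval lists are illustrative, and the use of the population standard deviation (`pstdev`) is an assumption, since the source does not say which form of σ is intended.

```python
import statistics


def coefficient_of_variation(intervals):
    """Return cv = sigma / mu for a list of time intervals in seconds."""
    mu = statistics.fmean(intervals)      # S300: average value
    if mu == 0:
        raise ValueError("mean interval is zero; cv is undefined")
    sigma = statistics.pstdev(intervals)  # S301: population SD (assumption)
    return sigma / mu                     # S302: cv = sigma / mu


# Illustrative data: a scripted client versus an irregular, human-like one.
regular = [22, 22, 22, 22, 22, 22]
irregular = [5, 40, 12, 90, 3, 61]

print(coefficient_of_variation(regular))    # 0.0
print(coefficient_of_variation(irregular))
```

Perfectly periodic access gives a coefficient of variation of zero, while irregular access gives a much larger value regardless of how long the average gap is.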
Further, in an embodiment of the present application, before the step S300, the method includes:
S3000: acquiring the quantity values of all the time intervals;
S3001: judging whether the quantity value is larger than a preset quantity threshold value or not;
S3002: if the quantity value is judged to be larger than a preset quantity threshold value, generating a calculation instruction for calculating the average value mu of all the time intervals.
As described in steps S3000 to S3002, before the step of calculating the average value μ of all the time intervals, the method further includes the following. The quantity value of all the time intervals, that is, the number of time intervals, is obtained. After the quantity value is obtained, it is judged whether the quantity value is larger than a preset quantity threshold. The way the quantity threshold is generated is not particularly limited; for example, the quantity threshold may be set to 5, and it may be generated by the device from statistics over a large amount of test data in historical data processing records, or determined by a user according to personal needs and input into the device. If the quantity value is judged to be larger than the preset quantity threshold, a calculation instruction for calculating the average value μ of all the time intervals is generated. When the quantity value is not larger than the preset quantity threshold, the specified user is considered to have accessed the specified URL too few times, so that whether the behavior of the specified user accessing the specified URL is an abnormal access behavior cannot be judged from so few time intervals. Further, if the quantity value is not larger than the preset quantity threshold, the access behavior information corresponding to the specified URL is acquired and sent to the terminal of the relevant operation and maintenance personnel, so that they can communicate with the specified user and determine from that communication whether the behavior of the specified user accessing the specified URL is an abnormal access behavior.
In this embodiment, only when the number value is greater than the preset number threshold, the calculation instruction for calculating the average value μ of all the time intervals is generated, so that the subsequent process of calculating the variation coefficient can be avoided when the number value is less than the preset number threshold, and the energy consumption of data processing is effectively saved.
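The gate in steps S3000 to S3002 can be sketched as a simple routing decision. The function name and the returned labels are hypothetical; the threshold value 5 is the example given in the text.

```python
QUANTITY_THRESHOLD = 5  # example value from the text


def route(intervals, quantity_threshold=QUANTITY_THRESHOLD):
    """Decide what to do with one (user, URL) pair's time intervals.

    Returns "compute" when there are enough intervals to calculate the
    coefficient of variation, and "manual_review" when the information
    should instead be sent to operation and maintenance personnel.
    """
    if len(intervals) > quantity_threshold:
        return "compute"
    return "manual_review"


print(route([22, 22, 22, 22, 22, 22]))  # compute
print(route([22, 22, 22]))              # manual_review
```

Note that the comparison is strict: with a threshold of 5, exactly five intervals are still routed to manual review.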
Further, in an embodiment of the present application, the step S4 includes:
S400: judging whether the variation coefficient is smaller than the variation threshold value;
S401: if the variation coefficient is smaller than the variation threshold, judging that the behavior of the specified user for accessing the specified URL is an abnormal access behavior;
S402: if the variation coefficient is judged to be not smaller than the variation threshold, judging that the behavior of the specified user for accessing the specified URL is not abnormal access behavior.
As described in the foregoing steps S400 to S402, the step of determining whether the behavior of the specified user accessing the specified URL is an abnormal access behavior according to the variation coefficient and a preset variation threshold may specifically include the following. First, it is judged whether the variation coefficient is smaller than the variation threshold. The way the variation threshold is generated is not particularly limited; for example, the variation threshold may be set to 0.01, and it may be generated by the device from statistics over a large amount of test data in historical data processing records, or determined by a user according to personal needs and input into the device. If the variation coefficient is smaller than the variation threshold, the behavior of the specified user accessing the specified URL is judged to be an abnormal access behavior. The smaller the variation coefficient, the lower the degree of dispersion of the data, that is, the more uniform the access intervals. A variation coefficient smaller than the variation threshold therefore indicates that the specified URL is being accessed at regular intervals, so the behavior of the specified user accessing the specified URL can be judged to be an abnormal access behavior. If the variation coefficient is judged to be not smaller than the variation threshold, the degree of dispersion of the access intervals is high, that is, the specified URL is not being accessed at regular intervals, and the behavior of the specified user accessing the specified URL is judged not to be an abnormal access behavior.
According to the embodiment, the calculated variation coefficient is compared with the preset variation threshold value to obtain the comparison result, so that whether the behavior of the specified user accessing the specified URL is an abnormal access behavior can be intelligently and accurately judged according to the comparison result, and the intelligence of identifying the abnormal access behavior of the regular access URL is effectively improved.
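The comparison in steps S400 to S402 can be sketched as below. The threshold 0.01 is the example value from the text; the function name, the sample data and the use of the population standard deviation are assumptions made for illustration.

```python
import statistics

VARIATION_THRESHOLD = 0.01  # example value from the text


def is_abnormal(intervals, threshold=VARIATION_THRESHOLD):
    """Return True when the access intervals are regular enough to flag."""
    mu = statistics.fmean(intervals)
    cv = statistics.pstdev(intervals) / mu  # population SD is an assumption
    return cv < threshold                   # S400 to S402


# Illustrative data: near-constant gaps versus widely scattered gaps.
scripted = [22.0, 22.1, 21.9, 22.0, 22.0, 22.1]
manual = [8, 95, 30, 240, 17, 62]

print(is_abnormal(scripted))  # True
print(is_abnormal(manual))    # False
```

For the scripted series the coefficient of variation is roughly 0.003, below the threshold, so the access is flagged; the manual series varies far more than 1% around its mean and is not.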
In an embodiment of the application, after the step S4, the method includes:
S410: after determining whether the access behaviors corresponding to all URLs in the log data are abnormal access behaviors or not, screening out specific URLs corresponding to the abnormal access behaviors, wherein the number of the specific URLs is one or more;
S411: acquiring specific access behavior information corresponding to the specific URL;
S412: displaying the specific access behavior information.
As described in the foregoing steps S410 to S412, after the step of determining whether the behavior of the specified user accessing the specified URL is an abnormal access behavior according to the variation coefficient and the preset variation threshold, all specific URL access behaviors corresponding to abnormal access behaviors may further be found in the log data and displayed, so as to provide a corresponding warning and reminder to the user. Specifically, after it has been determined whether the access behaviors corresponding to all URLs in the log data are abnormal access behaviors, the specific URLs corresponding to abnormal access behaviors are screened out, wherein the number of specific URLs is one or more, and the specific URLs may be marked, for example in bold or by highlighting. The way a specific URL corresponding to an abnormal access behavior is determined is the same as the process of determining whether the behavior of the specified user accessing the specified URL is an abnormal access behavior, which is not described again here. The specific access behavior information corresponding to the specific URL is then acquired. The specific access behavior information includes at least the specific user ID corresponding to the specific URL, the operation date, the operation time, the client IP and other information. Finally, the specific access behavior information is displayed so that a user can learn of all abnormal access behaviors relating to URLs in the log data. The specific access behavior information can be entered into a preset form template to generate corresponding form data, and the form data is displayed, so that the user can see all abnormal access behaviors in the log data more clearly.
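Steps S410 to S412 can be sketched as a filter followed by a plain-text table. The data structures, field names and function names here are hypothetical; the source only states that the displayed information includes at least the user ID, operation date, operation time and client IP.

```python
def build_report(verdicts, access_info):
    """S410/S411: collect rows for (user, URL) pairs judged abnormal.

    `verdicts` maps (user_id, url) to True/False; `access_info` maps the
    same keys to a dict of access details. Both layouts are assumptions.
    """
    rows = [("user_id", "url", "client_ip", "last_access")]
    for (user_id, url), abnormal in sorted(verdicts.items()):
        if abnormal:
            info = access_info[(user_id, url)]
            rows.append((user_id, url, info["client_ip"], info["last_access"]))
    return rows


def render(rows):
    """S412: format the rows as an aligned text table for display."""
    widths = [max(len(str(row[i])) for row in rows) for i in range(len(rows[0]))]
    return "\n".join(
        "  ".join(str(cell).ljust(width) for cell, width in zip(row, widths))
        for row in rows
    )


verdicts = {("u1", "/quote"): True, ("u2", "/quote"): False}
access_info = {
    ("u1", "/quote"): {"client_ip": "10.0.0.1", "last_access": "2020/3/3 12:25:14"},
    ("u2", "/quote"): {"client_ip": "10.0.0.2", "last_access": "2020/3/3 12:40:02"},
}
report = build_report(verdicts, access_info)
print(render(report))
```

The header row plays the role of the preset form template mentioned in the text: every flagged access is written into the same fixed columns.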
Further, in an embodiment of the present application, after the step S410, the method includes:
S4100: when an access event that a specific user accesses the specific URL is detected again, generating alarm information corresponding to the access event;
S4101: limiting the execution of response operation on the specific access behavior corresponding to the access event;
S4102: extracting specific user information from specific access behavior information corresponding to the specific URL;
S4103: and sending reminding information for modifying the personal information to a specific user terminal corresponding to the specific user information.
As described in steps S4100 to S4103, after the specific URLs corresponding to abnormal access behaviors have been screened out, subsequent accesses to a specific URL by the specific user may be restricted. Specifically, when an access event in which the specific user accesses the specific URL is detected again, corresponding alarm information is generated and displayed on a screen to alert operation and maintenance personnel to the current abnormal access to the specific URL. After the alarm information is generated, the response to the specific access behavior is restricted, which avoids the adverse effects of responding to it; here the specific access behavior is a behavior of attacking a system by accessing the specific URL. Specific user information is then extracted from the specific access behavior information corresponding to the specific URL. The specific user information includes at least the user's contact information, such as an email address. After the specific user information is obtained, reminding information for modifying the personal information is sent to the specific user terminal corresponding to the specific user information. The sending method of the reminding information is not particularly limited and may be, for example, email or short message. Reminding the specific user to modify the personal information helps cut off later occurrences of the specific access behavior at the source, so that the specific user has a normal experience when accessing URLs.
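The handling in steps S4100 to S4103 might look roughly like the sketch below. It assumes that the flagged (user, URL) pairs and a contact lookup are already available; the class `AccessGuard`, its fields, and the reminder wording are hypothetical, and actual delivery of alarms and reminders (screen display, email, short message) is left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class AccessGuard:
    """Sketch of S4100-S4103 for (user, URL) pairs already judged abnormal."""

    flagged: Set[Tuple[str, str]]   # (user_id, url) pairs flagged earlier
    contacts: Dict[str, str]        # user_id -> contact address (assumed available)
    alarms: List[str] = field(default_factory=list)
    reminders: List[Tuple[str, str]] = field(default_factory=list)

    def handle(self, user_id: str, url: str) -> bool:
        """Return True if the request may be served, False if it is restricted."""
        if (user_id, url) not in self.flagged:
            return True
        # S4100: generate alarm information for the repeated access event.
        self.alarms.append(f"abnormal access: user={user_id} url={url}")
        # S4102: extract the user's contact information.
        contact = self.contacts.get(user_id)
        # S4103: queue a reminder asking the user to modify personal information.
        if contact is not None:
            self.reminders.append((contact, "Please update your personal information."))
        # S4101: restrict the response by refusing to serve the request.
        return False
```

A caller would check the return value of `handle` before responding to the request, and a separate component would deliver the queued alarms and reminders.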
Referring to fig. 2, an embodiment of the present application further provides an abnormal access behavior recognition apparatus, including:
the first acquisition module 1 is used for acquiring log data;
the first calculating module 2 is configured to calculate, according to the log data, all time intervals between two adjacent accesses by a specified user to the same specified URL, where the specified URL is any one of all URLs contained in the log data;
the second calculating module 3 is configured to call a preset calculation formula to calculate a variation coefficient related to the time interval according to the time interval;
and the determining module 4 is configured to determine whether a behavior of the specified user accessing the specified URL is an abnormal access behavior according to the variation coefficient and a preset variation threshold.
In this embodiment, the implementation processes of the functions and actions of the first acquisition module, the first calculating module, the second calculating module and the determining module in the abnormal access behavior identification apparatus are specifically described in the implementation processes corresponding to steps S1 to S4 in the abnormal access behavior identification method, and are not described herein again.
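How the four modules fit together can be sketched end to end as below. This is an illustrative sketch under several assumptions: log entries are already parsed into `(user_id, url, timestamp_in_seconds)` tuples, the population standard deviation is used for σ, pairs with fewer than two intervals are skipped, and the threshold value passed by the caller is arbitrary. None of these choices is specified by the application.

```python
from collections import defaultdict
from statistics import mean, pstdev
from typing import Dict, Iterable, List, Tuple

# Assumed entry layout: (user_id, url, timestamp in seconds).
LogEntry = Tuple[str, str, float]
Key = Tuple[str, str]


def compute_intervals(entries: Iterable[LogEntry]) -> Dict[Key, List[float]]:
    """First calculating module: gaps between adjacent accesses per (user, URL)."""
    times: Dict[Key, List[float]] = defaultdict(list)
    for user, url, timestamp in entries:
        times[(user, url)].append(timestamp)
    intervals: Dict[Key, List[float]] = {}
    for key, stamps in times.items():
        stamps.sort()
        intervals[key] = [later - earlier for earlier, later in zip(stamps, stamps[1:])]
    return intervals


def identify(entries: Iterable[LogEntry], threshold: float) -> Dict[Key, bool]:
    """Second calculating and determining modules: cv = sigma / mu, then compare."""
    result: Dict[Key, bool] = {}
    for key, gaps in compute_intervals(entries).items():
        if len(gaps) < 2 or mean(gaps) == 0:
            continue  # too little data to judge; an assumption of this sketch
        cv = pstdev(gaps) / mean(gaps)
        result[key] = cv < threshold  # small cv: very regular access, flagged abnormal
    return result
```

With a user who hits a URL exactly every 60 seconds, every interval is equal, so the variation coefficient is 0 and the pair is flagged for any positive threshold; irregular human browsing gives a much larger value.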
Further, in an embodiment of the application, the first calculating module includes:
the filtering unit is used for carrying out data cleaning on the log data and filtering redundant data to obtain filtered log data;
the processing unit is used for carrying out format normalization processing on the filtered log data to obtain the log data in a standard format;
the screening unit is used for screening designated log data meeting the data screening conditions from the log data in the standard format according to preset data screening conditions;
and the first calculating unit is used for calculating, according to the designated log data, all time intervals between two adjacent accesses by the specified user to the same specified URL.
In this embodiment, the implementation processes of the functions and actions of the filtering unit, the processing unit, the screening unit and the first computing unit in the abnormal access behavior identification apparatus are specifically described in the implementation processes corresponding to steps S200 to S203 in the abnormal access behavior identification method, and are not described herein again.
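The four units above (cleaning, format normalization, screening, interval calculation) could be sketched as follows. The raw line format `time|user|url` with ISO 8601 timestamps, the treatment of blank lines and exact duplicates as "redundant data", and all function names are assumptions made purely for illustration.

```python
from datetime import datetime
from typing import Callable, List, Tuple

Entry = Tuple[str, str, datetime]  # (user_id, url, access time)


def clean(lines: List[str]) -> List[str]:
    """Filtering unit: drop blank lines and exact duplicates, keeping order."""
    seen = set()
    kept = []
    for line in lines:
        line = line.strip()
        if line and line not in seen:
            seen.add(line)
            kept.append(line)
    return kept


def normalize(lines: List[str]) -> List[Entry]:
    """Processing unit: parse each 'time|user|url' line into a uniform tuple."""
    entries = []
    for line in lines:
        stamp, user, url = line.split("|")
        entries.append((user.strip(), url.strip(), datetime.fromisoformat(stamp.strip())))
    return entries


def screen(entries: List[Entry], condition: Callable[[Entry], bool]) -> List[Entry]:
    """Screening unit: keep entries that satisfy the preset screening condition."""
    return [entry for entry in entries if condition(entry)]


def intervals_for(entries: List[Entry], user: str, url: str) -> List[float]:
    """First calculating unit: seconds between adjacent accesses of one URL by one user."""
    stamps = sorted(t for u, p, t in entries if u == user and p == url)
    return [(later - earlier).total_seconds() for earlier, later in zip(stamps, stamps[1:])]
```

The screening condition is passed in as a callable so that any preset condition (a date range, a user group, a URL prefix) can be plugged in without changing the other steps.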
Further, in an embodiment of the application, the second calculating module includes:
a second calculating unit, configured to calculate an average value μ of all the time intervals;
a third calculating unit, configured to calculate a standard deviation σ of all the time intervals;
and a fourth calculating unit, configured to call the calculation formula cv = σ/μ to calculate the coefficient of variation cv.
In this embodiment, the implementation processes of the functions and actions of the second computing unit, the third computing unit, and the fourth computing unit in the abnormal access behavior identification apparatus are specifically described in the implementation processes corresponding to steps S300 to S302 in the abnormal access behavior identification method, and are not described herein again.
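The calculation cv = σ/μ can be illustrated with Python's standard `statistics` module. The text does not say whether σ is the population or the sample standard deviation; this sketch assumes the population form (`pstdev`), and the function name and the zero-mean check are likewise assumptions.

```python
from statistics import mean, pstdev
from typing import Sequence


def coefficient_of_variation(intervals: Sequence[float]) -> float:
    """Return cv = sigma / mu for a non-empty sequence of time intervals."""
    mu = mean(intervals)
    if mu == 0:
        raise ValueError("mean interval is zero; cv is undefined")
    sigma = pstdev(intervals, mu)  # population standard deviation (assumed)
    return sigma / mu
```

For the intervals 60, 61, 59, 60 seconds, μ = 60 and σ = sqrt(0.5) ≈ 0.707, so cv ≈ 0.0118. Because cv is the standard deviation scaled by the mean, it measures relative dispersion and can be compared across URLs that are polled at different rates.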
Further, in an embodiment of the application, the second calculating module includes:
the acquisition unit is used for acquiring the quantity values of all the time intervals;
the first judgment unit is used for judging whether the quantity value is larger than a preset quantity threshold value or not;
and the generating unit is used for generating a calculation instruction for calculating the average value mu of all the time intervals if the quantity value is judged to be larger than a preset quantity threshold value.
In this embodiment, the implementation processes of the functions and actions of the acquisition unit, the first judgment unit, and the generating unit in the abnormal access behavior identification apparatus are specifically described in the implementation processes corresponding to steps S3000 to S3002 in the abnormal access behavior identification method, and are not described herein again.
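The quantity check that gates the mean calculation can be sketched as a simple guard. The value of `MIN_INTERVALS` and the choice to return `None` when there are too few intervals are assumptions; the text only requires that the mean be calculated when the quantity value is larger than a preset quantity threshold.

```python
from statistics import mean
from typing import Optional, Sequence

MIN_INTERVALS = 10  # hypothetical preset quantity threshold


def mean_if_enough(intervals: Sequence[float],
                   min_count: int = MIN_INTERVALS) -> Optional[float]:
    """Compute the mean only when the number of intervals exceeds the threshold."""
    if len(intervals) > min_count:
        return mean(intervals)
    return None  # too few samples; skip the statistics for this (user, URL) pair
```

The guard keeps a handful of accesses that happen to be evenly spaced from producing a misleadingly small variation coefficient.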
Further, in an embodiment of the present application, the determining module includes:
a second judgment unit, configured to judge whether the variation coefficient is smaller than the variation threshold;
a first determining unit, configured to determine that the behavior of the specified user accessing the specified URL is an abnormal access behavior if the variation coefficient is judged to be smaller than the variation threshold;
and a second determining unit, configured to determine that the behavior of the specified user accessing the specified URL is not an abnormal access behavior if the variation coefficient is judged to be not smaller than the variation threshold.
In this embodiment, the implementation processes of the functions and actions of the second judgment unit, the first determining unit and the second determining unit in the abnormal access behavior identification apparatus are specifically described in the implementation processes corresponding to steps S400 to S402 in the abnormal access behavior identification method, and are not described herein again.
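The decision rule itself is a strict comparison, sketched below. The function names and the example threshold in the usage note are hypothetical; only the direction of the comparison (smaller than the threshold means abnormal, otherwise not) comes from the text.

```python
from typing import Dict, Tuple


def is_abnormal(cv: float, variation_threshold: float) -> bool:
    """A cv strictly below the threshold means near-constant intervals: abnormal."""
    return cv < variation_threshold


def classify(cvs: Dict[Tuple[str, str], float],
             variation_threshold: float) -> Dict[Tuple[str, str], bool]:
    """Apply the rule to every (user_id, url) pair that has a variation coefficient."""
    return {key: is_abnormal(cv, variation_threshold) for key, cv in cvs.items()}
```

With an illustrative threshold of 0.1, a pair with cv = 0.01 is flagged, while pairs with cv = 0.1 or cv = 1.25 are not, since a value equal to the threshold is "not smaller than" it.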
Further, in an embodiment of the present application, the abnormal access behavior recognizing apparatus includes:
the screening module is used for screening out specific URLs corresponding to abnormal access behaviors after determining whether the access behaviors corresponding to all URLs in the log data are abnormal access behaviors or not, wherein the number of the specific URLs is one or more;
the second acquisition module is used for acquiring specific access behavior information corresponding to the specific URL;
and the display module is used for displaying the specific access behavior information.
In this embodiment, the implementation processes of the functions and actions of the screening module, the second acquisition module, and the display module in the abnormal access behavior identification apparatus are specifically described in the implementation processes corresponding to steps S410 to S412 in the abnormal access behavior identification method, and are not described herein again.
Further, in an embodiment of the present application, the abnormal access behavior recognizing apparatus includes:
the generating module is used for generating alarm information corresponding to the access event when the access event that a specific user accesses the specific URL is detected again;
the limiting module is used for limiting the execution of response operation on the specific access behavior corresponding to the access event;
the extraction module is used for extracting specific user information from the specific access behavior information corresponding to the specific URL;
and the reminding module is used for sending reminding information for modifying the personal information to the specific user terminal corresponding to the specific user information.
In this embodiment, the implementation processes of the functions and actions of the generating module, the limiting module, the extraction module, and the reminding module in the abnormal access behavior identification apparatus are specifically described in the implementation processes corresponding to steps S4100 to S4103 in the abnormal access behavior identification method, and are not described herein again.
Referring to fig. 3, an embodiment of the present application further provides a computer device, which may be a server and whose internal structure may be as shown in fig. 3. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device comprises a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The database of the computer device is used for storing log data, time intervals, variation coefficients, variation thresholds and other data. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program, when executed by the processor, implements an abnormal access behavior recognition method.
When executing the abnormal access behavior recognition method, the processor performs the following steps:
acquiring log data;
calculating, according to the log data, all time intervals between two adjacent accesses by the specified user to the same specified URL, wherein the specified URL is any one of all URLs contained in the log data;
calling a preset calculation formula to calculate a variation coefficient related to the time interval according to the time interval;
and determining whether the behavior of the specified user for accessing the specified URL is an abnormal access behavior or not according to the variation coefficient and a preset variation threshold.
Those skilled in the art will appreciate that the structure shown in fig. 3 is only a block diagram of a part of the structure related to the present application, and does not constitute a limitation to the apparatus and the computer device to which the present application is applied.
An embodiment of the present application further provides a computer-readable storage medium, on which a computer program is stored, where when the computer program is executed by a processor, the method for identifying an abnormal access behavior is implemented, and specifically:
acquiring log data;
calculating, according to the log data, all time intervals between two adjacent accesses by the specified user to the same specified URL, wherein the specified URL is any one of all URLs contained in the log data;
calling a preset calculation formula to calculate a variation coefficient related to the time interval according to the time interval;
and determining whether the behavior of the specified user for accessing the specified URL is an abnormal access behavior or not according to the variation coefficient and a preset variation threshold.
In summary, the abnormal access behavior recognition method and apparatus, computer device and storage medium provided in the embodiments of the present application acquire log data; calculate, according to the log data, all time intervals between two adjacent accesses by the specified user to the same specified URL, wherein the specified URL is any one of all URLs contained in the log data; call a preset calculation formula to calculate a variation coefficient of the time intervals; and determine whether the behavior of the specified user accessing the specified URL is an abnormal access behavior according to the variation coefficient and a preset variation threshold. The present application calculates time interval data from the acquired log data, calls the relevant calculation formula to analyze the dispersion of the time interval data and obtain the variation coefficient, and then determines, according to the variation coefficient and the preset variation threshold, whether the behavior of the specified user accessing the specified URL is an abnormal access behavior. Abnormal access behaviors in which a URL is accessed at regular intervals can therefore be identified in the log data accurately and quickly, which makes the identification of such behaviors more intelligent.
Those skilled in the art will understand that all or part of the processes of the above method embodiments may be implemented by a computer program instructing the relevant hardware. The computer program may be stored on a non-volatile computer-readable storage medium and, when executed, may include the processes of the above method embodiments. Any reference to memory, storage, database, or other medium provided herein and used in the embodiments may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, apparatus, article, or method that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, apparatus, article, or method. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, apparatus, article, or method that includes the element.
The above description is only a preferred embodiment of the present application and is not intended to limit the scope of the present application. All equivalent structures or equivalent process modifications made using the contents of the specification and the drawings of the present application, whether applied directly or indirectly in other related technical fields, are likewise included in the scope of the present application.

Claims (10)

1. An abnormal access behavior recognition method, comprising:
acquiring log data;
calculating, according to the log data, all time intervals between two adjacent accesses by the specified user to the same specified URL, wherein the specified URL is any one of all URLs contained in the log data;
calling a preset calculation formula to calculate a variation coefficient related to the time interval according to the time interval;
and determining whether the behavior of the specified user for accessing the specified URL is an abnormal access behavior or not according to the variation coefficient and a preset variation threshold.
2. The method for identifying abnormal access behavior according to claim 1, wherein the step of calculating all time intervals between two adjacent accesses of the specified user to the same specified URL according to the log data comprises:
carrying out data cleaning on the log data, and filtering redundant data to obtain filtered log data;
carrying out format normalization processing on the filtered log data to obtain log data in a standard format;
according to preset data screening conditions, screening designated log data meeting the data screening conditions from the log data in the standard format;
and calculating, according to the designated log data, all time intervals between two adjacent accesses by the specified user to the same specified URL.
3. The method for identifying abnormal access behavior according to claim 1, wherein the step of calling a preset calculation formula to calculate the coefficient of variation associated with the time interval according to the time interval comprises:
calculating the average value μ of all the time intervals;
calculating the standard deviation σ of all the time intervals;
and calling the calculation formula cv = σ/μ to calculate the coefficient of variation cv.
4. The abnormal access behavior recognition method of claim 3, wherein said step of calculating the average μ of all said time intervals is preceded by:
acquiring the quantity values of all the time intervals;
judging whether the quantity value is larger than a preset quantity threshold value or not;
and if the quantity value is judged to be larger than the preset quantity threshold, generating a calculation instruction for calculating the average value μ of all the time intervals.
5. The method according to claim 1, wherein the step of determining whether the behavior of the specified user accessing the specified URL is an abnormal access behavior according to the variation coefficient and a preset variation threshold includes:
judging whether the variation coefficient is smaller than the variation threshold value;
if the variation coefficient is smaller than the variation threshold, judging that the behavior of the specified user for accessing the specified URL is an abnormal access behavior;
and if the variation coefficient is judged to be not smaller than the variation threshold, judging that the behavior of the specified user for accessing the specified URL is not abnormal access behavior.
6. The method for identifying an abnormal access behavior according to claim 1, wherein after the step of determining whether the behavior of the specified user accessing the specified URL is an abnormal access behavior according to the variation coefficient and a preset variation threshold, the method comprises:
after determining whether the access behaviors corresponding to all URLs in the log data are abnormal access behaviors or not, screening out specific URLs corresponding to the abnormal access behaviors, wherein the number of the specific URLs is one or more;
acquiring specific access behavior information corresponding to the specific URL;
and displaying the specific access behavior information.
7. The abnormal access behavior recognition method according to claim 6, wherein after the step of screening out the specific URL corresponding to the abnormal access behavior after respectively determining whether the behaviors of accessing all URLs in the log data are the abnormal access behaviors, the method comprises:
when an access event that a specific user accesses the specific URL is detected again, generating alarm information corresponding to the access event;
limiting the execution of response operation on the specific access behavior corresponding to the access event;
extracting specific user information from specific access behavior information corresponding to the specific URL;
and sending reminding information for modifying the personal information to a specific user terminal corresponding to the specific user information.
8. An abnormal access behavior recognition apparatus, comprising:
the first acquisition module is used for acquiring log data;
the first calculation module is used for calculating, according to the log data, all time intervals between two adjacent accesses by the specified user to the same specified URL, wherein the specified URL is any one of all URLs contained in the log data;
the second calculation module is used for calling a preset calculation formula to calculate a variation coefficient related to the time interval according to the time interval;
and the determining module is used for determining whether the behavior of the specified user for accessing the specified URL is an abnormal access behavior according to the variation coefficient and a preset variation threshold.
9. A computer device comprising a memory and a processor, the memory having stored therein a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 7.
10. A storage medium having a computer program stored thereon, the computer program, when being executed by a processor, realizing the steps of the method of any one of claims 1 to 7.
CN202010479008.6A 2020-05-29 2020-05-29 Abnormal access behavior recognition method and device, computer equipment and storage medium Pending CN111818011A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010479008.6A CN111818011A (en) 2020-05-29 2020-05-29 Abnormal access behavior recognition method and device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN111818011A true CN111818011A (en) 2020-10-23

Family

ID=72847836

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010479008.6A Pending CN111818011A (en) 2020-05-29 2020-05-29 Abnormal access behavior recognition method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111818011A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination