CN116627707A - Detection method and system for abnormal operation behavior of user - Google Patents

Detection method and system for abnormal operation behavior of user

Info

Publication number
CN116627707A
Authority
CN
China
Prior art keywords
sequence
user
data
trend
abnormal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310890239.XA
Other languages
Chinese (zh)
Inventor
段存明
郑传义
袁春峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhongfu Safety Technology Co Ltd
Original Assignee
Zhongfu Safety Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhongfu Safety Technology Co Ltd filed Critical Zhongfu Safety Technology Co Ltd
Priority to CN202310890239.XA priority Critical patent/CN116627707A/en
Publication of CN116627707A publication Critical patent/CN116627707A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/079 Root cause analysis, i.e. error or fault diagnosis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2462 Approximate or statistical queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2474 Sequence data queries, e.g. querying versioned data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/27 Regression, e.g. linear or logistic regression
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Probability & Statistics with Applications (AREA)
  • Fuzzy Systems (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Quality & Reliability (AREA)
  • Biomedical Technology (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application provides a method and a system for detecting abnormal operation behaviors of a user, and belongs to the technical field of information security. The method comprises the following steps: acquiring user operation behavior data and generating a time sequence; smoothing the time sequence by adopting a local weighted regression (Loess) algorithm to generate a seasonal component; generating a trend residual sequence by using the time sequence and the seasonal component, and smoothing the trend residual sequence to generate a trend component; calculating the sum of the seasonal component and the trend component as a baseline component; generating a residual sequence using the time sequence and the baseline component; performing an interquartile range (IQR) measurement on the residual sequence to generate a measurement value; calculating a judgment interval according to the measurement value; and identifying abnormal points of the residual sequence by using the judgment interval so as to determine abnormal time sequence data. The application identifies abnormal user operation data by decomposing the time sequence into a baseline and a residual sequence, thereby effectively improving the accuracy of identifying abnormal user behavior.

Description

Detection method and system for abnormal operation behavior of user
Technical Field
The application relates to the technical field of information security, in particular to a method and a system for detecting abnormal operation behaviors of a user.
Background
With the rapid development of network technology, the security of network information is receiving more and more attention. The key to protecting network information security is predicting and identifying abnormal operation behavior and attack behavior of users. Currently, anti-virus software is generally used to protect network information. For suspicious behaviors, potential threats and attacks that traditional anti-virus software cannot detect, a baseline-based UEBA (User and Entity Behavior Analytics) method is used to detect and identify potential security threats and abnormal activities.
In a baseline-based UEBA analysis, a baseline model is first established that represents the normal behavior patterns of users and entities. The baseline model may be constructed based on historical data or predefined rules. Abnormal behavior is then detected by comparing the real-time data to the baseline model. Baseline-based UEBA analysis generally includes the following steps:
Data collection: logs, events, and metrics, including user and entity behavior data, are collected. Such data may include login activity, file access, network communications, permission changes, and the like.
Feature extraction: meaningful features are extracted from the collected data to describe the behavior of users and entities. Features may include timestamps, frequency of behavior, duration of behavior, type of behavior, etc.
Baseline modeling: a baseline model is constructed using historical data or predefined rules to describe the normal behavior of users and entities. The baseline model may be constructed based on statistical analysis methods or machine learning algorithms.
Anomaly detection: the real-time data is compared with the baseline model, and behavior that differs significantly from the baseline model is detected and identified. These differences may represent potential security threats or abnormal activities. Common anomaly detection methods include threshold detection, outlier detection, machine learning classification, and the like.
Baseline-based UEBA analysis may help organizations discover potential internal and external threats and provide timely security responses. By monitoring and analyzing the behaviors of users and entities, abnormal activities and unusual patterns are identified, so that security is improved and sensitive data is protected. However, this approach still has the following drawbacks:
1. Constructing the baseline model requires consideration of a number of factors, such as differences between users and entities and variations across time periods. For complex environments and varying patterns of behavior, it may be difficult to construct an appropriate baseline model.
2. UEBA analysis often suffers from false alarms. A high false alarm rate reduces the reliability of the analysis result and increases the workload of verifying and confirming it.
3. The baseline model requires constant maintenance and updating to accommodate changing environments and behavior patterns. However, the prior art lacks a dynamic update strategy for the baseline data, so the baseline lags behind service requirements, the baseline threshold becomes inaccurate, and false alarms appear in the analysis result.
Disclosure of Invention
In view of the problems existing in the prior art, the application aims to provide a method and a system for detecting abnormal operation behaviors of a user, which identify abnormal user operation data by decomposing a time sequence into a baseline and a residual sequence, so that the accuracy of identifying abnormal user operation behaviors is effectively improved.
To achieve the above aim, the application adopts the following technical scheme:
a detection method of abnormal operation behavior of a user comprises the following steps:
acquiring user operation behavior data, and generating a time sequence according to the time characteristics of the data;
smoothing the time sequence by adopting a local weighted regression algorithm to generate seasonal components;
generating a trend residual sequence by using the time sequence and the seasonal component, and smoothing the trend residual sequence by adopting a local weighted regression algorithm to generate a trend component;
calculating a sum of the seasonal component and the trend component as a baseline component;
generating a residual sequence using the time sequence and the baseline component;
performing an interquartile range measurement on the residual sequence, and calculating to generate a measurement value;
calculating a user abnormal behavior judgment interval according to the measurement value;
and identifying abnormal points of the residual sequence by using the user abnormal behavior judgment interval so as to determine abnormal time sequence data.
Further, the obtaining operation behavior data of the user and generating a time sequence according to time characteristics of the data includes:
and acquiring operation behavior data of the user, and generating a time sequence in a time aggregation mode according to time characteristics of the data.
Further, the smoothing the time sequence by adopting a local weighted regression algorithm to generate seasonal components includes:
smoothing the time sequence data in a time window by adopting a local weighted regression algorithm while preserving the trend characteristics of the time sequence data;
determining seasonal components by calculating a moving average of the smoothed time series;
wherein, when the moving average is calculated, the adopted time window is a time window matched with the seasonal period.
Further, the generating a trend residual sequence by using the time sequence and the seasonal component, and smoothing the trend residual sequence by adopting a local weighted regression algorithm, generating a trend component includes:
subtracting the seasonal component from the time sequence to generate a trend residual sequence;
and smoothing the trend residual sequence by adopting a local weighted regression algorithm to generate a trend component.
Further, the local weighted regression algorithm includes the steps of:
step 1: let the data points in the time series be $(x_i, y_i)$; the objective function of the weighted regression is defined as follows:
[equation missing from the source text]
wherein $w_i(x)$ is the weight function of the i-th data point, $x$ is the position of the point to be smoothed, $x_i$ is the location of the i-th data point, and $h$ is a smoothing parameter, i.e. a bandwidth function, for controlling the distribution of weights;
step 2: performing least square regression, and fitting a local polynomial model; let the local polynomial model be:
[equation missing from the source text]
determining the coefficients β of the local polynomial model by minimizing the sum-of-squares objective function $S(\beta)$, wherein n is the total number of data points and $y_i$ is the response value of the i-th data point;
step 3: selecting the bandwidth $h$, and calculating the weights according to the distances of the data points;
the bandwidth function is defined as: $h = k \times \operatorname{median}(|x - x_i|)$;
where k is a bandwidth adjustment factor for controlling the size of the bandwidth;
step 4: calculating a smoothed estimate for each point to be smoothed by minimizing the objective function $S(\beta)$.
Further, the generating a residual sequence using the time sequence and the baseline component includes:
the baseline component is subtracted from the time series to obtain a residual sequence.
Further, the performing an interquartile range measurement on the residual sequence, and calculating to generate a measurement value includes:
computing the quartiles of the residual sequence to obtain a lower quartile Q1 and an upper quartile Q3;
generating the interquartile range IQR using the formula IQR = Q3 - Q1.
Further, the calculating the user abnormal behavior judgment interval according to the measurement value includes:
calculating a lower limit value A and an upper limit value B of the user abnormal behavior judgment interval according to the formulas A = Q1 - k × IQR and B = Q3 + k × IQR;
and taking [A, B] as the user abnormal behavior judgment interval.
Further, the identifying abnormal points of the residual sequence by using the user abnormal behavior judgment interval to determine abnormal time series data includes:
judging whether each observed value of the residual sequence belongs to the interval [A, B];
if yes, marking the corresponding time series data as a normal point; if not, marking the corresponding time series data as an abnormal point;
generating an abnormal marking sequence according to the abnormal points marked in the time sequence, wherein the user operation behavior data corresponding to the abnormal points exhibits abnormal user operation behavior.
Correspondingly, the application also discloses a system for detecting the abnormal operation behavior of the user, which comprises:
a data acquisition unit configured to acquire user operation behavior data and generate a time sequence according to the time characteristics of the data;
a seasonal decomposition unit configured to smooth the time sequence by adopting a local weighted regression algorithm to generate a seasonal component;
a trend decomposition unit configured to generate a trend residual sequence by using the time sequence and the seasonal component, and to smooth the trend residual sequence by adopting a local weighted regression algorithm to generate a trend component;
a baseline creation unit configured to calculate the sum of the seasonal component and the trend component as a baseline component;
a residual sequence creation unit configured to generate a residual sequence using the time sequence and the baseline component;
a first calculation unit configured to perform an interquartile range measurement on the residual sequence and calculate a measurement value;
a second calculation unit configured to calculate a user abnormal behavior judgment interval according to the measurement value;
and an identification unit configured to identify abnormal points of the residual sequence using the user abnormal behavior judgment interval to determine abnormal time-series data.
Compared with the prior art, the application has the following beneficial effects: the application provides a detection method and a detection system for abnormal operation behaviors of a user, in which a time sequence is decomposed so that the trend component (Trend), the seasonal component (Seasonal) and the residual component (Residual) are taken as UEBA consideration factors, which greatly reduces the false recognition rate compared with traditional baseline analysis based on data statistics. In traditional approaches, the generation and updating of the baseline lags behind changes in the service data, and the trend and seasonal components are not taken into account in the UEBA analysis, so the false recognition rate is high when the service changes with seasons or economic cycles.
It can be seen that the present application has outstanding substantial features and significant advances over the prior art, as well as the benefits of its implementation.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present application, and that other drawings can be obtained according to the provided drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a method of an embodiment of the present application.
Fig. 2 is a system configuration diagram of an embodiment of the present application.
In the figures: 1. data acquisition unit; 2. seasonal decomposition unit; 3. trend decomposition unit; 4. baseline creation unit; 5. residual sequence creation unit; 6. first calculation unit; 7. second calculation unit; 8. identification unit.
Detailed Description
The following describes specific embodiments of the present application with reference to the drawings.
The method for detecting the abnormal operation behavior of the user shown in fig. 1 comprises the following steps:
S1: User operation behavior data is acquired, and a time sequence is generated according to the time characteristics of the data.
Specifically, operation behavior data of a user is obtained, and a time sequence is generated in a time aggregation mode according to time characteristics of the data.
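A minimal sketch of the time aggregation in S1, assuming the raw behavior data has already been reduced to a list of event timestamps and that one-hour buckets are used. The function name `aggregate_hourly`, the bucket size, and the sample events are illustrative assumptions; the application does not specify an aggregation granularity.

```python
from collections import Counter
from datetime import datetime, timedelta


def aggregate_hourly(timestamps):
    """Count events per hour and return (hour_start, count) pairs.

    Hours without events are filled with 0 so the series is evenly spaced.
    """
    if not timestamps:
        return []
    # Truncate every timestamp to the start of its hour and count per bucket.
    buckets = Counter(
        ts.replace(minute=0, second=0, microsecond=0) for ts in timestamps
    )
    start, end = min(buckets), max(buckets)
    series = []
    current = start
    while current <= end:
        series.append((current, buckets.get(current, 0)))
        current += timedelta(hours=1)
    return series


# Hypothetical login events for one user.
events = [
    datetime(2023, 7, 1, 9, 5),
    datetime(2023, 7, 1, 9, 40),
    datetime(2023, 7, 1, 11, 15),
]
hourly = aggregate_hourly(events)
counts = [count for _, count in hourly]  # [2, 0, 1]
```

Filling empty hours with zero keeps the series evenly spaced, which the later smoothing and seasonal steps implicitly rely on.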
It should be noted that in the present method, a time series is composed of a trend-cycle component, a seasonal component, and a remainder component (containing anything else in the time series), wherein the trend and the cycle are combined into a single trend-cycle component. The additive decomposition of the time series can be expressed as $y_t = S_t + T_t + R_t$, where $y_t$ is the data, $S_t$ is the seasonal component, $T_t$ is the trend-cycle component, and $R_t$ is the remainder. The multiplicative decomposition of the time series can be expressed as $y_t = S_t \times T_t \times R_t$. Time series decomposition algorithms include moving averages, classical decomposition, X11 decomposition, SEATS decomposition and STL decomposition.
S2: The time sequence is smoothed by a local weighted regression algorithm to generate a seasonal component.
Specifically, the objective of this step is to smooth the time series to reduce the effect of noise. In a specific embodiment, a local weighted regression (Loess) algorithm is adopted for smoothing, so that the data within a time window is smoothed while the overall trend characteristics are preserved. The seasonal component is estimated by calculating a moving average of the smoothed time series. The window size of the moving average is typically matched to the seasonal period to capture seasonal variations.
The local weighted regression (Loess) algorithm adopted by the method specifically comprises the following steps:
Step 1: Let the data points in the time series be $(x_i, y_i)$. The objective function of the weighted regression is defined as follows:
[equation missing from the source text]
wherein $w_i(x)$ is the weight function of the i-th data point, $x$ is the position of the point to be smoothed, $x_i$ is the location of the i-th data point, and $h$ is a smoothing parameter, i.e. a bandwidth function, for controlling the distribution of weights.
Step 2: Least square regression is performed and a local polynomial model is fitted. Common polynomial models are linear models (first-order polynomials) or quadratic models (second-order polynomials). Assume the local polynomial model is:
[equation missing from the source text]
The coefficients β of the local polynomial model are determined by minimizing the sum-of-squares objective function $S(\beta)$, wherein n is the total number of data points and $y_i$ is the response value of the i-th data point.
Step 3: The bandwidth $h$ is selected, and the weights are calculated according to the distances of the data points.
The bandwidth function is defined as: $h = k \times \operatorname{median}(|x - x_i|)$;
where k is a bandwidth adjustment factor for controlling the size of the bandwidth.
Step 4: A smoothed estimate is calculated for each point to be smoothed by minimizing the objective function $S(\beta)$.
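The four steps can be sketched as follows. This is one possible reading rather than the application's exact formulation: the weight function and the polynomial order are not recoverable from the text, so the sketch assumes a tricube kernel and a local linear model, and takes the bandwidth as k times the median distance to the point being smoothed. The function names and the default `k=2.0` are illustrative.

```python
import statistics


def tricube(u):
    """Tricube kernel (an assumed choice): 1 at u = 0, falling to 0 for u >= 1."""
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def loess_point(xs, ys, x0, k=2.0):
    """Smoothed estimate at x0 from a weighted local linear fit."""
    distances = [abs(x0 - xi) for xi in xs]
    h = k * statistics.median(distances)  # step 3: bandwidth from median distance
    if h == 0:
        weights = [1.0 if d == 0 else 0.0 for d in distances]
    else:
        weights = [tricube(d / h) for d in distances]

    sw = sum(weights)
    if sw == 0:
        return sum(ys) / len(ys)  # degenerate case: fall back to the plain mean

    # Step 2: closed-form weighted least squares for y = beta0 + beta1 * x.
    swx = sum(w * x for w, x in zip(weights, xs))
    swy = sum(w * y for w, y in zip(weights, ys))
    swxx = sum(w * x * x for w, x in zip(weights, xs))
    swxy = sum(w * x * y for w, x, y in zip(weights, xs, ys))
    denom = sw * swxx - swx * swx
    if abs(denom) < 1e-12:
        return swy / sw  # all weight on one x value: use the weighted mean
    beta1 = (sw * swxy - swx * swy) / denom
    beta0 = (swy - beta1 * swx) / sw
    return beta0 + beta1 * x0  # step 4: evaluate the local fit at x0


def loess_smooth(xs, ys, k=2.0):
    """Apply the local fit at every observed position."""
    return [loess_point(xs, ys, x0, k) for x0 in xs]


# A local linear fit reproduces exactly linear data, which is a quick sanity check.
xs = list(range(10))
ys = [2 * x + 1 for x in xs]
smoothed = loess_smooth(xs, ys)
```

A larger `k` widens the neighborhood and gives a smoother curve; a smaller `k` follows the data more closely.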
S3: A trend residual sequence is generated by using the time sequence and the seasonal component, and the trend residual sequence is smoothed by a local weighted regression algorithm to generate a trend component.
On the basis of the seasonal decomposition, the trend residual sequence is obtained by subtracting the seasonal component from the time sequence. Loess smoothing is then applied to the trend residual sequence to estimate the trend component.
S4: the sum of the seasonal component and the trend component is calculated as the baseline component.
The purpose of this step is to create a baseline, which is the sum of the seasonal and trend components.
S5: a residual sequence is generated using the time sequence and the baseline component.
Specifically, on the basis of the trend decomposition, the estimated seasonal and trend components are subtracted from the original time sequence X to obtain the residual sequence, i.e., residual sequence = X - baseline component. The residual sequence represents the portion of the original data that cannot be explained by trend and seasonality, i.e., random noise and aperiodic variations (e.g., abnormal behavior).
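The arithmetic of S3 to S5 can be illustrated with a deliberately simplified sketch. To stay short it does not use Loess: the seasonal component is estimated as the per-phase mean minus the overall mean, and the trend is a centered moving average of the trend residual. Both are stand-ins chosen for illustration, and the function names, the period, and the sample series are assumptions.

```python
def seasonal_by_phase(series, period):
    """Stand-in seasonal estimate: mean of each phase minus the overall mean."""
    overall = sum(series) / len(series)
    phase_offsets = []
    for phase in range(period):
        values = series[phase::period]
        phase_offsets.append(sum(values) / len(values) - overall)
    return [phase_offsets[i % period] for i in range(len(series))]


def moving_average(values, window):
    """Centered moving average; the window shrinks at the edges."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def decompose(series, period, trend_window):
    seasonal = seasonal_by_phase(series, period)
    trend_residual = [y - s for y, s in zip(series, seasonal)]  # S3: X - seasonal
    trend = moving_average(trend_residual, trend_window)        # S3: smoothing stand-in
    baseline = [s + t for s, t in zip(seasonal, trend)]         # S4: seasonal + trend
    residual = [y - b for y, b in zip(series, baseline)]        # S5: X - baseline
    return seasonal, trend, baseline, residual


# Hypothetical operation counts with period 2 and one spike at index 5.
series = [10, 20, 10, 20, 10, 60, 10, 20]
seasonal, trend, baseline, residual = decompose(series, period=2, trend_window=3)
largest = max(range(len(residual)), key=lambda i: abs(residual[i]))
```

In this toy series the largest residual lands on the spike at index 5. The spike also leaks into the neighboring trend values, which is one reason a robust smoother such as Loess is preferred over a plain moving average.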
As can be seen from the above steps, the method adopts the STL (Seasonal and Trend decomposition using Loess) time series decomposition algorithm. Specifically, based on the idea of local weighted regression (Loess), the time series is iteratively decomposed into three parts: trend, seasonal, and residual.
Compared with the moving averages, classical decomposition, X11 and SEATS decomposition algorithms commonly used in the prior art, the STL decomposition algorithm has the following advantages:
(1) Unlike SEATS and X11, STL can handle any type of seasonality, not just monthly and quarterly data.
(2) The seasonal component is allowed to change over time, and the rate of change can be controlled by the user.
(3) The smoothness of the trend-cycle can also be controlled by the user.
(4) It is robust to outliers, so that occasional anomalous observations do not affect the estimates of the trend-cycle and seasonal components.
S6: An interquartile range measurement is performed on the residual sequence to generate a measurement value.
First, the quartiles of the residual sequence are computed to obtain the lower quartile Q1 and the upper quartile Q3; then the interquartile range IQR is generated using the formula IQR = Q3 - Q1.
In particular embodiments, outliers in the residual sequence are detected using Tukey's fences, which rely on the interquartile range (IQR), a measure of the spread of the data. The IQR is also referred to as the midspread, the middle 50%, the fourth spread, or the H-spread. IQR is defined as the difference between the 75th and 25th percentiles of the data. To calculate the IQR, the data set is divided into quartiles, denoted Q1 (the lower quartile), Q2 (the median), and Q3 (the upper quartile). The lower quartile corresponds to the 25th percentile and the upper quartile corresponds to the 75th percentile, so the calculation formula is: IQR = Q3 - Q1.
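A short sketch of the S6 measurement using only the Python standard library. The application does not say which quantile interpolation rule it uses, so the `inclusive` method of `statistics.quantiles` is an assumption here, and the input values are made up.

```python
import statistics


def interquartile_range(values):
    """Return (Q1, Q3, IQR) for a sequence of residuals."""
    # n=4 yields the three cut points Q1, Q2 (median) and Q3.
    q1, _q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q1, q3, q3 - q1


q1, q3, iqr = interquartile_range([1, 2, 3, 4, 5, 6, 7, 8, 9])
# q1 == 3.0, q3 == 7.0, iqr == 4.0
```

Different interpolation rules give slightly different quartiles on small samples, so the same rule should be used consistently when intervals are compared over time.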
S7: A user abnormal behavior judgment interval is calculated according to the measurement value.
Specifically, a lower limit value A and an upper limit value B of the user abnormal behavior judgment interval are calculated according to the formulas A = Q1 - k × IQR and B = Q3 + k × IQR, and [A, B] is taken as the user abnormal behavior judgment interval.
In particular embodiments, the observations of the residual component are measured by the interquartile range. With Q1 and Q3 being the lower and upper quartiles respectively, an outlier can be defined as any observation outside the range [Q1 - k(Q3 - Q1), Q3 + k(Q3 - Q1)], where k = 1.5 identifies an "outlier"; that is, an outlier is an observation below Q1 - 1.5 × IQR or above Q3 + 1.5 × IQR.
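As a worked illustration with made-up quartiles (the values $Q_1 = -0.1$ and $Q_3 = 0.2$ are assumptions chosen for the arithmetic, not data from the application), the interval for $k = 1.5$ works out as follows:

```latex
\begin{aligned}
\mathrm{IQR} &= Q_3 - Q_1 = 0.2 - (-0.1) = 0.3 \\
A &= Q_1 - k \cdot \mathrm{IQR} = -0.1 - 1.5 \times 0.3 = -0.55 \\
B &= Q_3 + k \cdot \mathrm{IQR} = 0.2 + 1.5 \times 0.3 = 0.65
\end{aligned}
```

With this interval, a residual of 5.0 falls outside [-0.55, 0.65] and would be treated as abnormal, while a residual of -0.3 falls inside it and would be treated as normal.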
S8: Abnormal points of the residual sequence are identified by using the user abnormal behavior judgment interval so as to determine abnormal time sequence data.
Specifically, it is first judged whether each observed value of the residual sequence belongs to the interval [A, B]; if yes, the corresponding time series data is marked as a normal point; if not, it is marked as an abnormal point. Finally, an abnormal marking sequence is generated from the abnormal points marked in the time sequence, and the user operation behavior data corresponding to those points exhibits abnormal user operation behavior.
In a specific embodiment, the judgment interval is first calculated according to the formulas Q1 - k × IQR and Q3 + k × IQR. Depending on the particular needs, outliers may be defined as observations below Q1 - 1.5 × IQR or above Q3 + 1.5 × IQR.
When outliers are detected on the residual sequence, observations below Q1 - 1.5 × IQR or above Q3 + 1.5 × IQR are marked as abnormal points. Accordingly, an observation within the range [Q1 - k(Q3 - Q1), Q3 + k(Q3 - Q1)] is marked as a normal point. Finally, an abnormal marking sequence is generated from the marked abnormal points.
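The interval calculation and marking of S7 and S8 can be sketched as below. The helper names `tukey_interval` and `flag_anomalies`, the `inclusive` quantile method, and the sample residuals are assumptions for illustration; only the formulas A = Q1 - k × IQR and B = Q3 + k × IQR come from the text.

```python
import statistics


def tukey_interval(residuals, k=1.5):
    """Return the judgment interval [A, B] = [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _median, q3 = statistics.quantiles(residuals, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def flag_anomalies(residuals, k=1.5):
    """Abnormal marking sequence: True where a residual lies outside [A, B]."""
    lower, upper = tukey_interval(residuals, k)
    return [not (lower <= r <= upper) for r in residuals]


# Hypothetical residual sequence with one obvious spike at index 5.
residuals = [0.2, -0.1, 0.0, 0.3, -0.2, 5.0, 0.1, -0.3, 0.1]
lower, upper = tukey_interval(residuals)  # approximately (-0.55, 0.65)
flags = flag_anomalies(residuals)         # only index 5 is True
```

Because the marking sequence has the same length and order as the residual sequence, each `True` entry maps directly back to the time bucket, and therefore to the user operation records, that produced it.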
Correspondingly, as shown in fig. 2, the application also discloses a system for detecting abnormal operation behaviors of a user, comprising: a data acquisition unit 1, a seasonal decomposition unit 2, a trend decomposition unit 3, a baseline creation unit 4, a residual sequence creation unit 5, a first calculation unit 6, a second calculation unit 7 and an identification unit 8.
The data acquisition unit 1 is configured to acquire user operation behavior data and generate a time sequence according to the time characteristics of the data.
In a specific embodiment, the data acquisition unit 1 is specifically configured to: acquire operation behavior data of the user, and generate a time sequence by time aggregation according to the time characteristics of the data.
The seasonal decomposition unit 2 is configured to smooth the time sequence by using a local weighted regression algorithm to generate a seasonal component.
In a specific embodiment, the seasonal decomposition unit 2 is specifically configured to: smooth the time sequence data in a time window by adopting a local weighted regression algorithm while preserving the trend characteristics of the time sequence data; and determine the seasonal component by calculating a moving average of the smoothed time series, wherein the time window used for the moving average is matched with the seasonal period.
The trend decomposition unit 3 is configured to generate a trend residual sequence by using the time sequence and the seasonal component, and to smooth the trend residual sequence by using a local weighted regression algorithm to generate a trend component.
In a specific embodiment, the trend decomposition unit 3 is specifically configured to: subtract the seasonal component from the time sequence to generate a trend residual sequence; and smooth the trend residual sequence by adopting a local weighted regression algorithm to generate a trend component.
The baseline creation unit 4 is configured to calculate the sum of the seasonal component and the trend component as a baseline component.
The residual sequence creation unit 5 is configured to generate a residual sequence using the time sequence and the baseline component.
In a specific embodiment, the residual sequence creation unit 5 is specifically configured to: subtract the baseline component from the time series to obtain a residual sequence.
The first calculation unit 6 is configured to perform an interquartile range measurement on the residual sequence and calculate a measurement value.
In a specific embodiment, the first calculation unit 6 is specifically configured to: compute the quartiles of the residual sequence to obtain a lower quartile Q1 and an upper quartile Q3; and generate the interquartile range IQR using the formula IQR = Q3 - Q1.
The second calculation unit 7 is configured to calculate a user abnormal behavior judgment interval from the measurement value.
In a specific embodiment, the second calculation unit 7 is specifically configured to: calculate a lower limit value A and an upper limit value B of the user abnormal behavior judgment interval according to the formulas A = Q1 - k × IQR and B = Q3 + k × IQR; and take [A, B] as the user abnormal behavior judgment interval.
The identification unit 8 is configured to identify abnormal points of the residual sequence using the user abnormal behavior judgment interval to determine abnormal time-series data.
In a specific embodiment, the identification unit 8 is specifically configured to: judge whether each observed value of the residual sequence belongs to the interval [A, B]; if yes, mark the corresponding time series data as a normal point; if not, mark the corresponding time series data as an abnormal point; and generate an abnormal marking sequence according to the abnormal points marked in the time sequence, wherein the user operation behavior data corresponding to the abnormal points exhibits abnormal user operation behavior.
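One hypothetical way to wire units 2 to 8 together in software is shown below. The class name, the use of injected callables for the two decomposition units, and the stub units in the example are all assumptions made for illustration; unit 1 is assumed to have already produced the aggregated series.

```python
import statistics
from dataclasses import dataclass
from typing import Callable, List, Sequence

Series = List[float]


@dataclass
class AnomalyDetectionSystem:
    """Hypothetical composition of units 2 to 8 of the described system."""

    seasonal_unit: Callable[[Sequence[float]], Series]  # unit 2
    trend_unit: Callable[[Sequence[float]], Series]     # unit 3 (smooths the trend residual)
    k: float = 1.5

    def run(self, series: Sequence[float]) -> List[bool]:
        seasonal = self.seasonal_unit(series)
        trend = self.trend_unit([y - s for y, s in zip(series, seasonal)])
        baseline = [s + t for s, t in zip(seasonal, trend)]                   # unit 4
        residual = [y - b for y, b in zip(series, baseline)]                  # unit 5
        q1, _, q3 = statistics.quantiles(residual, n=4, method="inclusive")   # unit 6
        iqr = q3 - q1
        lower, upper = q1 - self.k * iqr, q3 + self.k * iqr                   # unit 7
        return [not (lower <= r <= upper) for r in residual]                  # unit 8


# Stub units for illustration only: no seasonality, constant trend at the median.
system = AnomalyDetectionSystem(
    seasonal_unit=lambda s: [0.0] * len(s),
    trend_unit=lambda s: [statistics.median(s)] * len(s),
)
flags = system.run([10, 11, 9, 10, 50, 10, 11, 9, 10])
```

Passing the decomposition units in as callables keeps the sketch independent of any particular smoother, so a Loess-based implementation could be substituted for the stubs without changing the rest of the pipeline.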
It will be apparent to those skilled in the art that the techniques of the embodiments of the present application may be implemented in software plus a necessary general-purpose hardware platform. Based on such understanding, the technical solution in the embodiments of the present application, in essence or in the part contributing to the prior art, may be embodied in the form of a software product. The software product is stored in a storage medium capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk, and includes several instructions for causing a computer terminal (which may be a personal computer, a server, a second terminal, a network terminal, etc.) to execute all or part of the steps of the method described in the embodiments of the present application. For the same or similar parts between the various embodiments in this specification, reference may be made to one another. In particular, since the terminal embodiment is substantially similar to the method embodiment, its description is relatively brief, and reference should be made to the description of the method embodiment for relevant points.
In the several embodiments provided by the present application, it should be understood that the disclosed systems and methods may be implemented in other ways. For example, the system embodiments described above are merely illustrative: the division into units is merely a division by logical function, and other divisions are possible in an actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, system, or unit, and may be electrical, mechanical, or in another form.
The units described as separate units may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional module in the embodiments of the present application may be integrated in one processing unit, or each module may exist alone physically, or two or more modules may be integrated in one unit.
Similarly, each processing unit in the embodiments of the present application may be integrated in one functional module, or each processing unit may exist alone physically, or two or more processing units may be integrated in one functional module.
The application will be further described with reference to the accompanying drawings and specific embodiments. It is to be understood that these examples are illustrative of the present application and are not intended to limit the scope of the present application. Further, it will be understood that various changes and modifications may be made by those skilled in the art after reading the teachings of the application, and equivalents thereof fall within the scope of the application as defined by the claims.

Claims (10)

1. A method for detecting abnormal operation behavior of a user, comprising:
acquiring user operation behavior data, and generating a time sequence according to the time characteristics of the data;
smoothing the time sequence by adopting a local weighted regression algorithm to generate seasonal components;
generating a trend residual sequence by using the time sequence and the seasonal component, and smoothing the trend residual sequence by adopting a local weighted regression algorithm to generate a trend component;
calculating a sum of the seasonal component and the trend component as a baseline component;
generating a residual sequence using the time sequence and the baseline component;
performing interquartile range measurement on the residual sequence, and calculating a metric value;
calculating a user abnormal behavior judgment interval according to the metric value;
and identifying abnormal points of the residual sequence by using the user abnormal behavior judgment interval, so as to determine abnormal time-series data.
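By way of a hedged illustration, the data flow of the decomposition steps of claim 1 (seasonal component, trend residual sequence, trend component, baseline component, residual sequence) can be sketched in Python. The two smoothers are passed in as parameters; `phase_means` and `constant_mean` below are simple stand-ins invented for this sketch and are not the locally weighted regression recited in the claim.

```python
def decompose(series, smooth_seasonal, smooth_trend):
    """Follow the claimed order: seasonal -> trend residual -> trend -> baseline -> residual."""
    seasonal = smooth_seasonal(series)
    trend_residual = [y - s for y, s in zip(series, seasonal)]
    trend = smooth_trend(trend_residual)
    baseline = [s + t for s, t in zip(seasonal, trend)]
    residual = [y - b for y, b in zip(series, baseline)]
    return seasonal, trend, baseline, residual


def phase_means(series, period):
    """Stand-in seasonal smoother (assumption): mean of the values at each position in the period."""
    means = [sum(series[p::period]) / len(series[p::period]) for p in range(period)]
    return [means[i % period] for i in range(len(series))]


def constant_mean(series):
    """Stand-in trend smoother (assumption): a flat line at the mean of its input."""
    m = sum(series) / len(series)
    return [m] * len(series)


# Hypothetical aggregated counts with period 2 and one unusual final value.
series = [10, 20, 10, 20, 10, 50]
seasonal, trend, baseline, residual = decompose(
    series, lambda s: phase_means(s, 2), constant_mean
)
print(residual)
```

Because the residual sequence is the time series minus the baseline, the baseline and the residual always add back up to the original series, whichever smoothers are supplied.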
2. The method for detecting abnormal operation behavior of a user according to claim 1, wherein the acquiring operation behavior data of the user and generating a time series according to time characteristics of the data comprises:
and acquiring operation behavior data of the user, and generating a time sequence in a time aggregation mode according to time characteristics of the data.
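As one possible reading of "time aggregation", the following Python sketch counts operation events per hour and fills empty hours with zero so that the resulting time series is evenly spaced. The hourly granularity, the function name `aggregate_hourly`, and the sample events are assumptions of this sketch; the claim does not specify the aggregation window.

```python
from collections import Counter
from datetime import datetime, timedelta


def aggregate_hourly(timestamps):
    """Count events per hour; return (hours, counts) with empty hours filled with 0."""
    if not timestamps:
        return [], []
    # Truncate each timestamp to the start of its hour and count occurrences.
    buckets = Counter(
        ts.replace(minute=0, second=0, microsecond=0) for ts in timestamps
    )
    hours, counts = [], []
    current, end = min(buckets), max(buckets)
    while current <= end:
        hours.append(current)
        counts.append(buckets.get(current, 0))
        current += timedelta(hours=1)
    return hours, counts


# Hypothetical operation events of one user.
events = [
    datetime(2023, 7, 20, 9, 5),
    datetime(2023, 7, 20, 9, 40),
    datetime(2023, 7, 20, 11, 15),
]
hours, counts = aggregate_hourly(events)
print(counts)
```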
3. The method for detecting abnormal operation behavior of a user according to claim 2, wherein smoothing the time series by using a locally weighted regression algorithm to generate seasonal components comprises:
smoothing the time sequence data within a time window by adopting a local weighted regression algorithm, while retaining the trend characteristics of the time sequence data;
determining seasonal components by calculating a moving average of the smoothed time series;
and when the moving average value is calculated, the adopted time window is a time window matched with the seasonal period.
4. The method for detecting abnormal operation behavior of a user according to claim 3, wherein the generating a trend residual sequence using a time sequence and a seasonal component, and smoothing the trend residual sequence using a local weighted regression algorithm, generating a trend component, comprises:
subtracting the seasonal component from the time sequence to generate a trend residual sequence;
and smoothing the trend residual sequence by adopting a local weighted regression algorithm to generate a trend component.
5. The method for detecting abnormal operation behavior of a user according to claim 4, wherein the local weighted regression algorithm comprises the steps of:
step 1: let the data points in the time series be $(x_i, y_i)$; the objective function of the weighted regression is defined as follows:
[formula not preserved in the source text]
wherein $w_i(x)$ is the weight function of the i-th data point, $x$ is the position of the point to be smoothed, $x_i$ is the location of the i-th data point, and $h$ is a smoothing parameter, i.e. a bandwidth function, for controlling the distribution of weights;
step 2: performing least squares regression, and fitting a local polynomial model; let the local polynomial model be:
[formula not preserved in the source text]
determining the coefficients $\beta$ of the local polynomial model by minimizing the squared objective function $S(\beta)$, wherein n is the total number of data points and $y_i$ is the response value of the i-th data point;
step 3: selecting the bandwidth $h$, and calculating the weights according to the distances of the data points;
the bandwidth function is defined as: k × median(|· - ·|), where the two operand symbols are not preserved in the source text;
where k is a bandwidth adjustment factor for controlling the size of the bandwidth;
step 4: calculating a smoothed estimate for each point to be smoothed by minimizing the objective function $S(\beta)$.
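For illustration only, a locally weighted regression of the general kind described in steps 1 to 4 can be sketched in Python. The sketch assumes a degree-1 local polynomial, a tricube weight function, and a fixed bandwidth supplied by the caller; none of these choices is confirmed by the claim, whose formulas are not preserved here, and the median-based bandwidth rule of step 3 is not implemented.

```python
def tricube(u):
    """Tricube kernel (assumed weight function): (1 - |u|^3)^3 for |u| < 1, else 0."""
    u = abs(u)
    return (1 - u ** 3) ** 3 if u < 1 else 0.0


def local_linear_smooth(xs, ys, bandwidth):
    """Smooth ys at every position in xs with a weighted straight-line fit.

    For each target x0 this minimizes sum_i w_i * (y_i - b0 - b1 * x_i)^2,
    where w_i = tricube((x_i - x0) / bandwidth), and evaluates the fit at x0.
    """
    smoothed = []
    for x0 in xs:
        w = [tricube((x - x0) / bandwidth) for x in xs]
        sw = sum(w)  # always > 0 because the point at x0 itself has weight 1
        mx = sum(wi * x for wi, x in zip(w, xs)) / sw
        my = sum(wi * y for wi, y in zip(w, ys)) / sw
        sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
        sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0
        smoothed.append(my + slope * (x0 - mx))
    return smoothed


# Hypothetical series: flat at zero with a single spike in the middle.
xs = list(range(9))
ys = [0, 0, 0, 0, 10, 0, 0, 0, 0]
smooth = local_linear_smooth(xs, ys, bandwidth=3)
print(smooth)
```

A wider bandwidth gives neighbouring points more weight and flattens the spike further; a bandwidth no larger than the spacing between points leaves each value unchanged.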
6. The method for detecting abnormal operation behavior of a user according to claim 4, wherein generating a residual sequence using a time sequence and a baseline component comprises:
the baseline component is subtracted from the time series to obtain a residual sequence.
7. The method for detecting abnormal operation behavior of a user according to claim 6, wherein the performing interquartile range measurement on the residual sequence and calculating a metric value comprises:
performing quartile measurement on the residual sequence to obtain a lower quartile Q1 and an upper quartile Q3;
calculating the interquartile range IQR by using the formula IQR = Q3 - Q1.
8. The method for detecting abnormal operation behavior of a user according to claim 7, wherein the calculating a user abnormal behavior judgment interval according to the metric value comprises:
calculating a lower limit value A and an upper limit value B of the user abnormal behavior judgment interval according to the formulas A = Q1 - k·IQR and B = Q3 + k·IQR;
and taking [A, B] as the user abnormal behavior judgment interval.
9. The method for detecting abnormal operation behavior of a user according to claim 8, wherein the identifying abnormal points of the residual sequence by using the user abnormal behavior judgment interval to determine abnormal time-series data comprises:
judging whether each observed value of the residual sequence belongs to the interval [A, B];
if yes, marking the corresponding time-series data as a normal point; if not, marking the corresponding time-series data as an abnormal point;
generating an abnormal marking sequence according to the abnormal points marked in the time series, wherein the user operation behavior data corresponding to the abnormal points exhibit abnormal user operation behavior.
10. A system for detecting abnormal operation behavior of a user, comprising:
the data acquisition unit is configured to acquire user operation behavior data and generate a time sequence according to the time characteristics of the data;
the seasonal decomposition unit is configured to carry out smoothing treatment on the time sequence by adopting a local weighted regression algorithm to generate a seasonal component;
the trend decomposition unit is configured to generate a trend residual sequence by using the time sequence and the seasonal component, and carry out smoothing treatment on the trend residual sequence by adopting a local weighted regression algorithm to generate a trend component;
a base line creation unit configured to calculate a sum of seasonal components and trend components as a base line component;
a residual sequence creation unit configured to generate a residual sequence using the time sequence and the baseline component;
the first calculation unit is configured to perform interquartile range measurement on the residual sequence and calculate a metric value;
the second calculation unit is configured to calculate a user abnormal behavior judgment interval according to the metric value;
and an identification unit configured to identify abnormal points of the residual sequence by using the user abnormal behavior judgment interval, so as to determine abnormal time-series data.
CN202310890239.XA 2023-07-20 2023-07-20 Detection method and system for abnormal operation behavior of user Pending CN116627707A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310890239.XA CN116627707A (en) 2023-07-20 2023-07-20 Detection method and system for abnormal operation behavior of user

Publications (1)

Publication Number Publication Date
CN116627707A true CN116627707A (en) 2023-08-22

Family

ID=87602877

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310890239.XA Pending CN116627707A (en) 2023-07-20 2023-07-20 Detection method and system for abnormal operation behavior of user

Country Status (1)

Country Link
CN (1) CN116627707A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111324639A (en) * 2020-02-11 2020-06-23 京东数字科技控股有限公司 Data monitoring method and device and computer readable storage medium
WO2020127656A1 (en) * 2018-12-20 2020-06-25 Worldline Anomaly detection in data flows with confidence intervals
CN111444168A (en) * 2020-03-26 2020-07-24 易电务(北京)科技有限公司 Distribution room transformer daily maximum load abnormal data detection processing method
CN112965876A (en) * 2021-03-10 2021-06-15 中国民航信息网络股份有限公司 Monitoring alarm method and device
CN112966222A (en) * 2021-03-10 2021-06-15 中国民航信息网络股份有限公司 Time series abnormal data detection method and related equipment
CN114218009A (en) * 2021-12-30 2022-03-22 山东云海国创云计算装备产业创新中心有限公司 Time series abnormal value detection method, device, equipment and storage medium
WO2022117911A1 (en) * 2020-12-04 2022-06-09 Elisa Oyj Anomaly detection

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
朱双 等: "《流域水文分析与中长期预报方法》", vol. 1, 中国地质大学出版社, pages: 50 - 51 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117350508A (en) * 2023-10-31 2024-01-05 深圳市黑云精密工业有限公司 Production work order distribution system based on real-time acquisition data of production line collector
CN117310118A (en) * 2023-11-28 2023-12-29 济南中安数码科技有限公司 Visual monitoring method for groundwater pollution
CN117310118B (en) * 2023-11-28 2024-03-08 济南中安数码科技有限公司 Visual monitoring method for groundwater pollution
CN117421610A (en) * 2023-12-19 2024-01-19 山东德源电力科技股份有限公司 Data anomaly analysis method for electric energy meter running state early warning
CN117421610B (en) * 2023-12-19 2024-03-15 山东德源电力科技股份有限公司 Data anomaly analysis method for electric energy meter running state early warning
CN117648590A (en) * 2024-01-30 2024-03-05 山东万洋石油科技有限公司 Omnibearing gamma logging data optimization processing method
CN117648590B (en) * 2024-01-30 2024-04-19 山东万洋石油科技有限公司 Omnibearing gamma logging data optimization processing method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20230822