CA3132346C - User abnormal behavior recognition method and device and computer readable storage medium - Google Patents

User abnormal behavior recognition method and device and computer readable storage medium

Info

Publication number
CA3132346C
Authority
CA
Canada
Prior art keywords
data
user
neural network
neuron
winning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CA3132346A
Other languages
French (fr)
Other versions
CA3132346A1 (en
Inventor
Yiwen Li
Xin Huang
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
10353744 Canada Ltd
Original Assignee
10353744 Canada Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 10353744 Canada Ltd
Publication of CA3132346A1
Application granted
Publication of CA3132346C
Legal status: Active

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/088Non-supervised learning, e.g. competitive learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The present invention discloses a user anomaly behavior identification method, apparatus, and computer readable storage medium, relating to the field of computer technology. The method comprises: obtaining time series data and spatial series data associated with user behavior; according to a plurality of actual indicator values before a pre-set time point in the time series data, predicting, through an ARIMA model, a confidence interval of the indicator for the user at the pre-set time point; comparing the actual indicator value of the user at the pre-set time point with the corresponding confidence interval of the indicator, obtaining a first detection result for the user behavior; according to the spatial series data, performing anomaly detection through a pre-trained SOM neural network model, obtaining a second detection result for the user behavior; and according to the first detection result and the second detection result, performing anomaly identification on the user behavior. Implementations of the present invention can achieve accurate and reliable identification of user anomaly behavior.

Description

USER ABNORMAL BEHAVIOR RECOGNITION METHOD AND DEVICE AND COMPUTER
READABLE STORAGE MEDIUM
Field
[0001] The present disclosure relates to the field of computer technology, particularly to a user anomaly behavior identification method, apparatus, and computer readable storage medium.
Background
[0002] Information security is an increasingly prominent concern. The theft of commonly used network and app accounts may cause information leakage or the transfer of funds, or the stolen account may be used as a springboard for a series of attacks on important assets. Many industries have no clear identification and tracking method, so the biggest victims are often the users themselves. Because account permissions differ, it is difficult to judge simply from the extent of an activity whether it is illegal behavior, and because business is complex, it is difficult to accurately determine whether an account is in a normal status or an anomaly status. An anomaly status is a phenomenon or an event, generated by various anomaly activities, that is not consistent with the user's routine.
[0003] At present, the identification of user anomaly behavior usually adopts K-Means clustering, an unsupervised machine learning algorithm. However, the K-Means algorithm needs the number of classes (k) to be determined in advance, and after finding the most similar class for each input data, it updates only the parameters of that class. The result of each run is therefore unstable under the influence of the initial values and noise data, and as a result risky users cannot be accurately and reliably identified.
Invention Content
[0004] To solve the problems described in the background above, the present invention provides a user anomaly behavior identification method, apparatus and computer readable storage medium, to achieve accurate and reliable identification of user anomaly behavior.
[0005] The technical solutions provided in implementations of the present invention are as follows:

Date recue / Date received 2021-11-29
[0006] The first aspect provides a user anomaly behavior identification method, comprising:
[0007] Obtaining time series data and spatial series data associated with user behavior;
[0008] According to a plurality of actual indicator values before a pre-set time point in the time series data, predicting, through an ARIMA model, a confidence interval of the indicator for the user at the pre-set time point;
[0009] Comparing the actual indicator value of the user at the pre-set time point with the corresponding confidence interval of the indicator, obtaining a first detection result for the user behavior;
[0010] According to the spatial series data, performing anomaly detection through pre-trained SOM neural network model, obtaining second detection result for the user behavior;
[0011] According to the first detection result and the second detection result, performing anomaly identification on the user behavior.
[0012] Furthermore, the ARIMA model is constructed by following methods:
[0013] Obtaining time series sample data associated with sample user behavior;
[0014] Performing a stationarity test on the time series sample data and, for time series sample data that fails the test, differencing the data to obtain stationary time series sample data;
[0015] For the stationary time series sample data, establishing an initial ARIMA model, and determining the ranges of the initial ARIMA model's autoregressive order and moving average order according to the autocorrelation coefficient and partial autocorrelation coefficient of the stationary time series sample data;
[0016] Using the AIC information criterion, determining the optimal combination of the initial ARIMA model's autoregressive order and moving average order, and constructing the ARIMA model.

[0017] Furthermore, the SOM neural network model is obtained by training through the following method:
[0018] S1, initializing the weight of each neuron in the pre-set SOM neural network;
[0019] S2, obtaining spatial series sample data associated with the behavior of sample users, normalizing each spatial series sample data, obtaining a training sample set;
[0020] S3, randomly selecting training samples from the training sample set to be input into the input layer of the SOM neural network, obtaining an input vector;
[0021] S4, according to the Euclidean distance between the input vector and each neuron in the competition layer of the SOM neural network, searching for the winning neuron corresponding to the input vector;
[0022] S5, using a gradient descent method, performing a weight update on the winning neuron and on each neuron in the set of neurons around the winning neuron;
[0023] S6, iteratively executing steps S3 to S5 until a pre-set end condition is reached, at which point the training ends, obtaining the SOM neural network model and a plurality of clusters output by the SOM neural network model.
[0024] Furthermore, according to the spatial series data, performing anomaly detection through pre-trained SOM neural network model, obtaining a second detection result for the user behavior, comprising:
[0025] Normalizing the spatial series data, taking the normalized spatial series data as input parameters and inputting them to the SOM neural network model, and, according to the Euclidean distance from the input parameters to each neuron, determining the winning neuron corresponding to the input parameters and the cluster to which the winning neuron belongs;
[0026] Calculating cluster area of the winning neurons, comparing the cluster area with an area threshold, wherein, only when the cluster area is less than the area threshold, the cluster is an anomaly cluster;

[0027] According to the comparing result, generating a second detection result for user behavior.
[0028] Furthermore, according to the first detection result and the second detection result, after anomaly recognition of user behavior, the method comprises:
[0029] If the identification result for the user behavior is user anomaly behavior, performing identity authentication on the user, or restricting the user's operations and behavior.
[0030] The second aspect provides an apparatus for identifying user anomaly behavior, the apparatus comprises:
[0031] A data obtaining module configured to obtain time series data and spatial series data associated with user behavior;
[0032] A first detection module configured to predict, through an ARIMA model and according to a plurality of actual indicator values before a pre-set time point in the time series data, a confidence interval of the indicator for the user at the pre-set time point; and to compare the actual indicator value of the user at the pre-set time point with the corresponding confidence interval of the indicator, obtaining a first detection result for the user behavior;
[0033] A second detection module configured to perform anomaly detection through a pre-trained SOM neural network model according to the spatial series data and obtain a second detection result for the user behavior;
[0034] An anomaly identification module configured to perform anomaly identification on the user behavior according to the first detection result and the second detection result.
[0035] Furthermore, the apparatus also comprises construction module, the construction module is specifically for:
[0036] Obtaining time series sample data associated with sample user behavior;

[0037] Performing a stationarity test on the time series sample data and, for time series sample data that fails the test, differencing the data to obtain stationary time series sample data;
[0038] For the stationary time series sample data, establishing an initial ARIMA model, and determining the ranges of the initial ARIMA model's autoregressive order and moving average order according to the autocorrelation coefficient and partial autocorrelation coefficient of the stationary time series sample data;
[0039] Using the AIC information criterion, determining the optimal combination of the initial ARIMA model's autoregressive order and moving average order, and constructing the ARIMA model.
[0040] Furthermore, the apparatus also comprises training module, the training module comprises:
[0041] An initializing submodule configured to initialize weight of each neuron in the pre-set SOM neural network;
[0042] A pre-processing submodule configured to obtain spatial series sample data associated with behavior of sample user, normalization processing each spatial series sample data, obtaining training sample set;
[0043] A training submodule configured to randomly select training samples from the training sample set to be input into the input layer of the SOM neural network and obtain an input vector; to search, according to the Euclidean distance between the input vector and each neuron in the competition layer of the SOM neural network, for the winning neuron corresponding to the input vector; and to perform, using a gradient descent method, a weight update on the winning neuron and on each neuron in the set of neurons around the winning neuron;
[0044] An iteration submodule configured to iteratively execute the steps of the training submodule until a pre-set end condition is reached, at which point the training ends, obtaining the SOM neural network model and a plurality of clusters output by the SOM neural network model.
[0045] Furthermore, the second detection module is specifically for:
[0046] Normalizing the spatial series data, taking the normalized spatial series data as input parameters and inputting them to the SOM neural network model, and, according to the Euclidean distance from the input parameters to each neuron, determining the winning neuron corresponding to the input parameters and the cluster to which the winning neuron belongs;
[0047] Calculating cluster area of the winning neurons, comparing the cluster area with an area threshold, wherein, only when the cluster area is less than the area threshold, the cluster is an anomaly cluster;
[0048] According to the comparing result, generating a second detection result for user behavior.
[0049] Furthermore, the apparatus also comprises anomaly processing module, the anomaly processing module is specifically for:
[0050] If the identification result for the user behavior is user anomaly behavior, performing identity authentication on the user, or restricting the user's operations and behavior.
[0051] The third aspect provides a computer device, comprising:
[0052] one or a plurality of processors;
[0053] A storage apparatus configured to store one or a plurality of programs;
[0054] When one or a plurality of programs are executed by one or a plurality of processors, the processors achieve following operation steps:
[0055] Obtaining time series data and spatial series data associated with user behavior;
[0056] According to a plurality of actual indicator values before a pre-set time point in the time series data, predicting, through an ARIMA model, a confidence interval of the indicator for the user at the pre-set time point;
[0057] Comparing the actual indicator value of the user at the pre-set time point with the corresponding confidence interval of the indicator, obtaining a first detection result for the user behavior;

[0058] According to the spatial series data, performing anomaly detection through pre-trained SOM neural network model, obtaining second detection result for the user behavior;
[0059] According to the first detection result and the second detection result, performing anomaly identification on the user behavior.
[0060] The fourth aspect provides a computer readable storage medium stored with a computer program configured to achieve following operation steps when the processor executes the computer program:
[0061] Obtaining time series data and spatial series data associated with user behavior;
[0062] According to a plurality of actual indicator values before a pre-set time point in the time series data, predicting, through an ARIMA model, a confidence interval of the indicator for the user at the pre-set time point;
[0063] Comparing the actual indicator value of the user at the pre-set time point with the corresponding confidence interval of the indicator, obtaining a first detection result for the user behavior;
[0064] According to the spatial series data, performing anomaly detection through pre-trained SOM neural network model, obtaining second detection result for the user behavior;
[0065] According to the first detection result and the second detection result, performing anomaly identification on the user behavior.
[0066] Comparing to prior art, the technical effects achieved by the technical solutions of the present invention are:
[0067] 1. The SOM neural network clustering algorithm is used, which has non-linearity, robustness and strong adaptive learning ability as well as an outstanding ability to manage uncertain or fuzzy information; it overcomes the limitation of the K-Means algorithm, which is influenced by the pre-determined K value and noise data, and thus improves the accuracy and reliability of the identification of user anomaly behavior;

[0068] 2. The combination of the ARIMA model and the SOM neural network model is used to mine user anomaly points in both the time and space aspects; compared with a single traditional method, this can improve the ability and accuracy of identifying anomaly points.
Drawing Description
[0069] In order to describe the technical solutions in the implementations of the present application or the prior art more clearly, the drawings that need to be used are briefly introduced below. Obviously, the drawings in the following description are only some implementations of the application; those of ordinary skill in the art can obtain other drawings based on these drawings without creative work.
[0070] Figure 1 is a process diagram of a user anomaly behavior identification method in the implementation of the present invention;
[0071] Figure 2 is a structural diagram of a user anomaly behavior identification apparatus in the implementation of the present invention;
[0072] Figure 3 is an internal structure of a computer device in the implementation of the present invention.
Specific implementation methods
[0073] In order to make the purpose, technical solutions and benefits of the present invention clearer, the technical solutions of the implementations in the present application are described clearly and completely below with reference to the accompanying drawings. Obviously, the described implementations are only a part of the implementations in the present application. Based on the implementations in the present application, all other implementations obtained by those of ordinary skill in the art fall within the protection scope of the present application.
[0074] It should be noted that, unless the context clearly requires otherwise, words such as "comprising" and "contains" throughout the specification and claims should be interpreted in an inclusive sense rather than an exclusive or exhaustive sense; in other words, they mean "including but not limited to".
[0075] In addition, in the description of the present invention, the terms "first", "second", etc. are used only for descriptive purposes and cannot be understood as indicating or implying relative importance. Unless otherwise indicated, "plurality" means two or more.
[0076] As described in the background, the current identification of user anomaly behavior usually adopts the unsupervised K-Means clustering machine learning algorithm. However, the K-Means algorithm needs the number of classes (k) to be determined in advance, and after finding the most similar class for each input data, it updates only the parameters of that class; the result of each run is therefore unstable under the influence of the initial values and noise data, and as a result risky users cannot be accurately and reliably identified. For this reason, the implementations of the present invention provide a user anomaly behavior identification method that uses the combination of an ARIMA model and a SOM neural network model to mine user anomaly points in both the time and space aspects, which, compared with a single traditional method, can improve the ability and accuracy of identifying anomaly points. Meanwhile, the method uses the SOM neural network clustering algorithm, which has non-linearity, robustness and strong adaptive learning ability as well as an outstanding ability to manage uncertain or fuzzy information, and which overcomes the limitation of the K-Means algorithm of being influenced by the pre-determined K value and noise data.
[0077] Implementation one
[0078] The implementation of the present invention provides a user anomaly behavior identification method. The method is applied to a user anomaly behavior identification apparatus, which can be configured in any computer device; the computer device can be a server, and the server can be an independent server or a server cluster consisting of a plurality of servers.
[0079] As shown in Figure 1, the method for user anomaly behavior identification provided by the implementation of the present invention can comprise following steps:
[0080] 101, obtaining time series data and spatial series data associated with user behavior;

[0081] Specifically, user data within a pre-set time period can be obtained and then pre-processed to extract the time series data and spatial series data associated with user behavior.
[0082] The user data comprises user attribute data and user behavior data. The user attribute data can comprise name, age, mailing address, etc.; the user behavior data can comprise the IP address of account registration, the IP address of each login, the time information of each login, page click information, user device information, online duration and other related information; the user device information can comprise device MAC address, device gyroscope data, device acceleration data, CPU, memory, disk I/O and other information.
[0083] The time series data is a series of indicator values obtained by sorting the actual indicator values of a user in a pre-set time period in chronological order. An indicator value is the value of a parameter indicator obtained by statistics over the user's numerical data related to user behavior in the pre-set time period. The parameter indicator can be any one of online duration, device moving distance and change in screen temperature, or it can be another indicator.
[0084] The spatial series data refers to the user's behavior trajectory data in spatial order during use of the application, where the spaces are connected by sequence, flow, and direction; for example, the behavior trajectory data involved in a user logging in to the application to perform a transfer operation forms the user's spatial series data.
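As a rough illustration of step 101, the sketch below derives both series from raw records. The record layout (user id, ISO timestamp, page name, online minutes), the per-day aggregation, and the names `events` and `build_series` are assumptions made for this example; the specification does not prescribe them.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical event records: (user_id, ISO timestamp, page visited, online minutes).
events = [
    ("u1", "2021-03-02T09:00:00", "login", 30.0),
    ("u1", "2021-03-01T08:30:00", "login", 25.0),
    ("u1", "2021-03-02T09:05:00", "transfer", 8.0),
    ("u1", "2021-03-01T08:40:00", "balance", 10.0),
]

def build_series(records):
    """Return (time_series, spatial_series) for one user's records.

    time_series: total online minutes per day, in chronological order.
    spatial_series: pages visited, in the order they were visited.
    """
    ordered = sorted(records, key=lambda r: datetime.fromisoformat(r[1]))
    daily = defaultdict(float)
    for _, ts, _, minutes in ordered:
        daily[datetime.fromisoformat(ts).date()] += minutes
    time_series = [daily[day] for day in sorted(daily)]
    spatial_series = [page for _, _, page, _ in ordered]
    return time_series, spatial_series

time_series, spatial_series = build_series(events)
```

Here the indicator is online duration; any other numeric indicator named in the text could be aggregated the same way.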
[0085] 102, according to a plurality of actual indicator values before a pre-set time point in the time series data, predicting, through the ARIMA model, a confidence interval of the indicator for the user at the pre-set time point.
[0086] The pre-set time point can be the time point corresponding to the Nth data among the M data included in the time series data, where N is greater than 1 and N is less than M.
[0087] Specifically, the plurality of actual indicator values before the pre-set time point in the time series data are substituted into the ARIMA model for prediction, obtaining the predicted indicator value at the pre-set time point and the confidence interval of the predicted indicator value at confidence level α.
[0088] ARIMA (Autoregressive Integrated Moving Average model) predicts future values from past and present values; it regards the time series as a random series and finds an optimal function to fit it.
[0089] The ARIMA(p, d, q) model is defined as follows:
$y_t = \varphi_1 y_{t-1} + \varphi_2 y_{t-2} + \dots + \varphi_p y_{t-p} + e_t - \theta_1 e_{t-1} - \theta_2 e_{t-2} - \dots - \theta_q e_{t-q}$;
[0090] where p is the autoregressive order, d is the differencing order of the series, q is the moving average order, $y_t$ is the time series observation at time t, $e_t$ is the white noise series, and $\varphi_i$ and $\theta_i$ are the coefficients of $y_{t-i}$ and $e_{t-i}$ respectively.
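To make the model definition concrete, the following sketch evaluates a one-step prediction as the weighted sum of the p most recent observations minus the weighted sum of the q most recent noise terms, following the sign convention of the formula above. The function name `arma_one_step` and all coefficient values are hypothetical; in practice the coefficients would come from fitting the model to the (differenced) series.

```python
def arma_one_step(history, residuals, phi, theta):
    """Predicted y_t from the p latest observations and q latest residuals.

    history[-1] is y_{t-1} and residuals[-1] is e_{t-1}. The future shock e_t
    has expectation zero, so it is left out of the forecast.
    """
    ar_part = sum(phi[i] * history[-1 - i] for i in range(len(phi)))
    ma_part = sum(theta[j] * residuals[-1 - j] for j in range(len(theta)))
    return ar_part - ma_part

# Hypothetical coefficients for p = 2, q = 1.
phi = [0.5, 0.25]        # phi_1, phi_2
theta = [0.5]            # theta_1
history = [10.0, 12.0]   # y_{t-2}, y_{t-1}
residuals = [2.0]        # e_{t-1}

forecast = arma_one_step(history, residuals, phi, theta)
# 0.5 * 12 + 0.25 * 10 - 0.5 * 2 = 7.5
```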
[0091] Furthermore, the ARIMA model can be constructed by the following steps a to d:
[0092] a. Obtaining time series sample data associated with user behavior.
[0093] Specifically, for the implementation of this step, refer to the process of obtaining time series data in step 101, which is not repeated here.
[0094] b. Performing a stationarity test on the time series sample data and, for time series sample data that fails the test, differencing the data to obtain stationary time series sample data.
[0095] Specifically, a unit root test is adopted to determine whether the time series sample data is stationary. If the data is non-stationary, it needs to be made stationary, which means that the series is differenced repeatedly until it meets the stationarity test conditions, eliminating the trend in the data and yielding the stationary time series sample data. The differencing order d of the ARIMA model is the number of times the series is differenced before it becomes a stationary time series.
[0096] c. For the stationary time series sample data, establishing an initial ARIMA model, and determining the ranges of the initial ARIMA model's autoregressive order and moving average order according to the autocorrelation coefficient and partial autocorrelation coefficient of the stationary time series sample data.

[0097] d. Using the AIC information criterion, determining the optimal combination of the initial ARIMA model's autoregressive order and moving average order, and constructing the ARIMA model.
[0098] Specifically, after the differencing order d of the model is determined, the ranges of the autoregressive order p and the moving average order q are defined, the combinations of (p, q) are traversed, and the combination of (p, q) with the minimum AIC value is identified based on the AIC information criterion. In the end, the optimal p, d and q are applied in the ARIMA model for prediction.
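Steps b to d can be sketched as two small routines: difference the series until a stationarity check passes (the count gives d), then traverse the (p, q) grid and keep the pair with the smallest AIC. This is only an outline under stated assumptions: `halves_agree` is a crude stand-in for a unit root test, and `toy_aic` is a made-up scoring function standing in for "fit an ARIMA model with these orders and return its AIC". All names are hypothetical.

```python
import itertools

def difference(series):
    """First-order difference: y_t - y_{t-1}."""
    return [b - a for a, b in zip(series, series[1:])]

def difference_until_stationary(series, is_stationary, max_d=3):
    """Difference repeatedly until is_stationary(series) holds; return (series, d)."""
    d = 0
    while not is_stationary(series) and d < max_d:
        series = difference(series)
        d += 1
    return series, d

def select_order(p_values, q_values, aic_of):
    """Return the (p, q) pair with the smallest AIC, where aic_of(p, q) scores a fit."""
    return min(itertools.product(p_values, q_values), key=lambda pq: aic_of(*pq))

def halves_agree(series, tol=1.0):
    """Stand-in check (assumption): the means of both halves are close.

    A real pipeline would use a unit root test here instead.
    """
    mid = len(series) // 2
    first, second = series[:mid], series[mid:]
    return abs(sum(first) / len(first) - sum(second) / len(second)) <= tol

def toy_aic(p, q):
    """Made-up AIC surface (assumption) with its minimum at p = 2, q = 1."""
    return (p - 2) ** 2 + (q - 1) ** 2 + 100.0

trend = [2.0 * t for t in range(20)]  # linear trend, not stationary
stationary, d = difference_until_stationary(trend, halves_agree)
best_p, best_q = select_order(range(0, 4), range(0, 4), toy_aic)
```

With the linear trend above, one round of differencing yields a constant series, so d is 1. The candidate ranges passed to `select_order` would normally be read off the autocorrelation and partial autocorrelation plots, as step c describes.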
[0099] 103, comparing the actual indicator value of the user at the pre-set time point with the corresponding confidence interval of the indicator, obtaining a first detection result for the user behavior.
[0100] Specifically, it is determined whether the actual indicator value at the pre-set time point is within the confidence interval of the predicted indicator value, and a first detection result for the user behavior is generated according to the determination result. When the actual indicator value falls outside the confidence interval, the first detection result indicates that the actual indicator value at the pre-set time point is an anomaly value; when the actual indicator value falls within the confidence interval, the first detection result indicates that it is a normal value.
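The comparison in step 103 amounts to an interval membership test. The sketch below assumes the interval is built as a symmetric normal approximation around the forecast using the standard deviation of the model residuals; that construction, the 0.95 confidence level, and the function names are assumptions for illustration, since the text only states that the ARIMA model yields the interval.

```python
from statistics import NormalDist

def confidence_interval(forecast, residual_std, confidence=0.95):
    """Two-sided normal-approximation interval around a one-step forecast."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return forecast - z * residual_std, forecast + z * residual_std

def first_detection(actual, interval):
    """Return 'anomaly' when actual lies outside the interval, else 'normal'."""
    lower, upper = interval
    return "normal" if lower <= actual <= upper else "anomaly"

# Hypothetical forecast of 40 minutes online with a residual spread of 5 minutes.
interval = confidence_interval(forecast=40.0, residual_std=5.0)
result_inside = first_detection(43.0, interval)
result_outside = first_detection(70.0, interval)
```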
[0101] 104, according to the spatial series data, performing anomaly detection through the pre-trained SOM neural network model, obtaining a second detection result for the user behavior.
[0102] SOM (Self-Organizing Map neural network) is an unsupervised artificial neural network. The network structure of a SOM has two layers: an input layer and an output layer (also called the competition layer). A neural network is usually trained by backpropagation of a loss function, whereas the SOM uses a competitive learning strategy, relying on competition among the neurons to gradually optimize the network. The neurons form a matrix of equidistant nodes arranged in two dimensions, which constitutes the output layer; each node has a corresponding weight vector whose dimension equals that of the input data, and a nearest-neighbor relationship function is used to maintain the topology of the input space.

[0103] The SOM neural network model can be obtained by training through the following method, comprising steps S1 to S6:
[0104] S1, initializing the weight of each neuron in the pre-set SOM neural network;
[0105] Specifically, the pre-set SOM neural network is initialized: the weight of each neuron of the SOM neural network can be initialized to a very small random number greater than 0 and less than 1. In addition, the number of model iterations, the learning rate, and the neighborhood radius also need to be initialized; for example, the number of iterations can be set to i = 1000, the initial learning rate to rate_max = 0.2 with rate_min = 0.05, and the initial neighborhood radius to zone_max = 1.5 with zone_min = 0.8. Each model parameter can be adjusted according to different data or requirements. A learning rate that is too small reduces the speed of network optimization and increases training time, while a learning rate that is too large can cause the network parameters to swing back and forth on both sides of the final optimal value, so that the network fails to converge. In a specific implementation, at the beginning of the training of the SOM neural network the learning rate is set to 0.2 and then decreased at a faster rate, which helps to quickly capture the general structure of the input vectors; once the learning rate has been reduced to a small value, the weights of the neurons can be adjusted to conform to the distribution structure of the samples in the input space. In addition, in the training process of the SOM neural network, a neighborhood radius R is set with the winning neuron as the center; R is initialized to the initial neighborhood radius, and the area within this radius is called the winning neighborhood. The range of the winning neighborhood shrinks as the number of training iterations increases and finally shrinks to a fixed value of the neighborhood radius.
[0106] S2, obtaining spatial series sample data associated with behavior of sample user, normalization processing each spatial series sample data, obtaining training sample set.
[0107] For the process of obtaining the spatial series sample data, refer to the obtaining of time series data in step 101, which is not repeated here.
[0108] S3, randomly selecting training samples from the training sample set to be input into the SOM neural network input layer, obtaining input vector.

[0109] S4, according to Euclidean distance between the input vector and each neuron in competition layer of the SOM neural network, searching for winning neuron corresponding to the input vector.
[0110] Specifically, the Euclidean distance between the input vector X and each neuron is calculated, and the neuron with the smallest Euclidean distance to the input vector X is the winning neuron. All neurons in the output layer of the SOM neural network compete with each other, and only one winning neuron can be activated each time.
[0111] S5, using gradient descent method, performing weight update on the winning neuron and each neuron of neurons set around the winning neuron.
[0112] Specifically, a neighborhood radius is set with the winning neuron as the center, and the area within the radius is called winning neighborhood, according to the coordinates of winning neuron and the radius of neighborhood, determining all neurons in the winning neighborhood, and using the gradient descent method to update the weight of each neuron in the winning neighborhood.
[0113] S6, iteratively executing steps S3 to S5 until a pre-set end condition is reached, at which point the training ends, obtaining the trained SOM neural network model and a plurality of clusters output by the SOM neural network model.
[0114] Specifically, a new input sample is read from the training sample set and steps S3 to S5 are executed iteratively until all training samples have been used; after the weights of all winning neurons have been updated, the learning rate and the neighborhood function are updated. When the number of training iterations of the SOM neural network reaches the pre-set maximum, the training and learning process exits, yielding the trained SOM neural network model and a plurality of clusters output by the SOM neural network model, wherein each cluster corresponds to a neighborhood scope (i.e., the winning neighborhood) and each neighborhood contains at least one neuron.
[0115] In the present implementation, the SOM neural network is used to uncover the correlations between the various influencing factors in the spatial series data, which is more useful for the classification and study of anomaly user behavior and has a high generalization ability.
[0116] The implementation process of the above step 104 can comprise:
[0117] 1041, normalizing the spatial series data and inputting the normalized spatial series data to the SOM neural network model as input parameters.
[0118] 1042, according to the Euclidean distance from the input parameters to each neuron, determining the winning neuron corresponding to the input parameters and the cluster of the winning neuron.
[0119] Specifically, the Euclidean distance between the input vector X and each neuron is calculated, the neuron with the smallest Euclidean distance to the input vector X is the winning neuron, and the neighborhood to which this winning neuron belongs is determined.
[0120] 1043, calculating the cluster area of the winning neuron and comparing the cluster area with an area threshold, wherein the cluster is an anomaly cluster only when the cluster area is less than the area threshold.
[0121] The area threshold can be set according to actual needs; in general, a small cluster area indicates an isolated cluster of very small size, and such a cluster is treated as an anomaly cluster.
[0122] Specifically, the neighborhood radius of the winning neighborhood in which the winning neuron is located is determined, the area of the circle whose radius is the neighborhood radius is calculated and used as the cluster area of the cluster to which the winning neuron belongs, and the cluster area is compared with the area threshold.
[0123] 1044, according to the comparison result, generating a second detection result for the user behavior.
[0124] When the cluster area of the cluster to which the winning neuron belongs is less than the area threshold, the second detection result indicates that the user spatial series data is anomaly data; when the cluster area of the cluster to which the winning neuron belongs is not less than the area threshold, the second detection result indicates that the user spatial series data is normal data.
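Steps 1043 and 1044 can be sketched as below. The cluster area is the area of a circle whose radius is the neighborhood radius of the winning neighborhood, as stated in the text; the function name `second_detection`, the string labels, and the example threshold are illustrative assumptions.

```python
import math


def second_detection(neighborhood_radius: float, area_threshold: float) -> str:
    """Label spatial series data from the size of its winning neighborhood.

    The cluster area is pi * R^2, where R is the neighborhood radius of the
    cluster to which the winning neuron belongs.
    """
    cluster_area = math.pi * neighborhood_radius ** 2
    # Only a cluster strictly smaller than the threshold is an anomaly cluster.
    return "anomaly" if cluster_area < area_threshold else "normal"
```

For example, with a hypothetical threshold of 5.0, a radius of 1.0 gives an area of about 3.14 and is labeled an anomaly, while a radius of 2.0 gives about 12.57 and is labeled normal.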
[0125] It should be noted that the implementation of the present invention does not specifically limit the order in which step 102 and step 104 are performed; concurrent execution is the preferred solution.
[0126] 105, according to the first detection result and the second detection result, performing anomaly identification on the user behavior.
[0127] Specifically, anomaly user behavior is identified as follows:
[0128] If the first detection result and the second detection result are both normal, the user behavior is determined as normal; if the first detection result and the second detection result are both anomaly, the user behavior is determined as anomaly; if only one of the first detection result and the second detection result is normal, the user behavior is determined as suspicious anomaly behavior, which can then be identified manually.
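The decision rule of step 105, together with the preferred concurrent execution of the two detections, can be sketched as follows. The two detectors are passed in as zero-argument callables returning "normal" or "anomaly"; that interface, the thread pool, and the names `combine` and `detect` are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def combine(first: str, second: str) -> str:
    """Merge the first (time) and second (space) detection results."""
    for result in (first, second):
        if result not in ("normal", "anomaly"):
            raise ValueError(f"unexpected detection result: {result!r}")
    if first == second:
        return first                 # both normal, or both anomaly
    return "suspicious"              # exactly one is normal: review manually


def detect(first_detector: Callable[[], str],
           second_detector: Callable[[], str]) -> str:
    """Run both detectors concurrently, then combine their results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        first = pool.submit(first_detector)
        second = pool.submit(second_detector)
        return combine(first.result(), second.result())
```

Keeping the "suspicious" outcome separate lets a deployment route disagreements between the two models to manual review instead of forcing a binary decision.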
[0129] Furthermore, after step 105, the method also comprises:
[0130] If the identification result of the user behavior is user anomaly behavior, performing identity authentication on the user, or restricting the user's operations and behavior.
[0131] The restriction operation comprises disabling the key functions on the key pages of the application; the key functions include but are not limited to viewing, inputting, and submitting.
[0132] In the present implementation, when the user is determined to be a risk user, performing identity authentication on the user or restricting the user's operations accordingly can effectively control and prevent network security risks.
[0133] The method for identifying user anomaly behavior provided by the implementation of the present invention uses the SOM neural network clustering algorithm, which is non-linear and robust, has strong adaptive learning ability and an outstanding ability to handle uncertain or fuzzy information, and overcomes the limitation of the K-Means algorithm, which is affected by the pre-determined K value and by noise data, thereby improving the accuracy and reliability of identifying user anomaly behavior. In addition, the ARIMA model and the SOM neural network model are combined to mine user anomaly points in both time and space, which, compared with a single traditional method, improves the ability and accuracy of identifying anomaly points.
[0134] Implementation two
[0135] The apparatus for identifying user anomaly behavior provided by the implementation of the present invention is shown in Figure 2; the apparatus comprises:
[0136] A data obtaining module 202 configured to obtain time series data and spatial series data associated with user behavior;
[0137] A first detection module 204 configured to predict, through an ARIMA model and according to a plurality of actual indicator values before a pre-set time point in the time series data, a confidence interval of the indicator for the user at the pre-set time point; and to compare the actual indicator value of the user at the pre-set time point with the corresponding confidence interval of the indicator to obtain a first detection result for the user behavior;
[0138] A second detection module 206 configured to perform anomaly detection through a pre-trained SOM neural network model according to the spatial series data and obtain a second detection result for the user behavior;
[0139] An anomaly identification module 208 configured to perform anomaly identification on the user behavior according to the first detection result and the second detection result.
[0140] Furthermore, the apparatus also comprises a construction module, wherein the construction module is specifically for:
[0141] Obtaining time series sample data associated with sample user behavior;
[0142] Performing a stationarity test on the time series sample data and, for time series sample data that fails the test, differencing the data to obtain stationary time series sample data;
[0143] For the stationary time series sample data, establishing an initial ARIMA model, and determining the ranges of the autoregressive order and the moving average order of the initial ARIMA model according to the autocorrelation coefficient and partial autocorrelation coefficient of the stationary time series sample data;
[0144] Using the AIC information criterion, determining the optimal combination of the autoregressive order and the moving average order of the initial ARIMA model, and constructing the ARIMA model.
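The construction steps above can be sketched in outline as follows. This is a hedged skeleton, not a model-fitting routine: the stationarity test and the AIC computation are supplied by the caller as the hypothetical callables `is_stationary` and `aic_of` (in practice they would wrap a unit root test and an ARIMA fit from a statistics library), and the cap `max_d` is an assumption added to guarantee termination.

```python
from itertools import product
from typing import Callable, Iterable, List, Sequence, Tuple


def difference(series: Sequence[float]) -> List[float]:
    """First-order differencing: y'[t] = y[t] - y[t-1]."""
    return [b - a for a, b in zip(series, series[1:])]


def make_stationary(series: Sequence[float],
                    is_stationary: Callable[[Sequence[float]], bool],
                    max_d: int = 3) -> Tuple[List[float], int]:
    """Difference until the caller-supplied test passes; return (series, d)."""
    current, d = list(series), 0
    while d < max_d and not is_stationary(current):
        current = difference(current)
        d += 1
    return current, d


def select_order(p_values: Iterable[int],
                 q_values: Iterable[int],
                 aic_of: Callable[[int, int], float]) -> Tuple[int, int]:
    """Traverse all (p, q) combinations and keep the one with minimum AIC."""
    return min(product(p_values, q_values), key=lambda pq: aic_of(*pq))
```

The number of differencing passes returned by `make_stationary` plays the role of the differential order d, and `select_order` returns the (p, q) pair used to construct the final model.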
[0145] Furthermore, the apparatus also comprises a training module, wherein the training module comprises:
[0146] An initializing submodule configured to initialize the weight of each neuron in the pre-set SOM neural network;
[0147] A pre-processing submodule configured to obtain spatial series sample data associated with the behavior of sample users, normalize each item of spatial series sample data, and obtain a training sample set;
[0148] A training submodule configured to randomly select a training sample from the training sample set and input it into the input layer of the SOM neural network to obtain an input vector; to search for the winning neuron corresponding to the input vector according to the Euclidean distance between the input vector and each neuron in the competition layer of the SOM neural network; and to use a gradient descent method to update the weights of the winning neuron and of each neuron in the set of neurons around the winning neuron;
[0149] An iteration submodule configured to iteratively execute the steps of the training submodule until a pre-set end condition is reached, at which point the training ends, obtaining the trained SOM neural network model and a plurality of clusters output by the SOM neural network model.
[0150] Furthermore, the second detection module 206 is specifically for:
[0151] Normalizing the spatial series data, inputting the normalized spatial series data to the SOM neural network model as input parameters, and, according to the Euclidean distance from the input parameters to each neuron, determining the winning neuron corresponding to the input parameters and the cluster of the winning neuron;
[0152] Calculating the cluster area of the winning neuron and comparing the cluster area with an area threshold, wherein the cluster is an anomaly cluster only when the cluster area is less than the area threshold;
[0153] According to the comparison result, generating a second detection result for the user behavior.
[0154] Furthermore, the apparatus also comprises an anomaly processing module, wherein the anomaly processing module is specifically for:
[0155] If the identification result of the user behavior is user anomaly behavior, performing identity authentication on the user, or restricting the user's operations and behavior.
[0156] The anomaly behavior identification apparatus provided by the implementation of the present invention belongs to the same inventive concept as the anomaly behavior identification method provided by the implementation of the present invention; it can execute that method and has the functional modules and beneficial effects corresponding to it. For technical details that are not described in this implementation, please refer to the method for identification of anomaly user behavior provided in the implementation of the present invention, which will not be repeated here.
[0157] Figure 3 is the internal structure diagram of the computer device provided by the implementation of the present invention. The computer device includes a processor, a memory, a network interface, and a database connected through a system bus. The processor of the computer device is configured to provide calculation and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for running the operating system and the computer programs stored in the non-volatile storage medium. The network interface of the computer device is used to communicate with an external terminal through a network connection. The computer program is executed by the processor to implement a user anomaly behavior identification method.

[0158] Those skilled in the art can understand that the structure shown in Figure 3 is only a partial structural diagram related to the solution of this application and does not constitute a limitation on the computer device to which the solution is applied; a specific computer device can include more or fewer components than shown in the figure, combine some components, or have a different arrangement of components.
[0159] In an implementation, a computer device is provided which includes a memory, a processor, and a computer program stored on the memory and runnable on the processor. The processor achieves the following steps when executing the computer program:
[0160] Obtaining time series data and spatial series data associated with user behavior.
[0161] According to a plurality of actual indicator values before a pre-set time point in the time series data, predicting, through an ARIMA model, a confidence interval of the indicator for the user at the pre-set time point.
[0162] Comparing the actual indicator value of the user at the pre-set time point with the corresponding confidence interval of the indicator, obtaining a first detection result for the user behavior.
[0163] According to the spatial series data, performing anomaly detection through pre-trained SOM neural network model, obtaining second detection result for the user behavior.
[0164] According to the first detection result and the second detection result, performing anomaly identification on the user behavior.
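The interval comparison that produces the first detection result can be sketched as below. The sketch assumes that the fitted ARIMA model supplies a point forecast and a forecast standard error for the pre-set time point and that the forecast error is approximately normal; those inputs, the default confidence level of 0.95, and the name `first_detection` are assumptions, not details given in the text.

```python
from statistics import NormalDist


def first_detection(actual: float, forecast: float, std_error: float,
                    confidence: float = 0.95) -> str:
    """Compare the actual indicator value with the forecast confidence interval.

    Assumption: a symmetric normal interval around the point forecast.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)   # two-sided quantile
    lower = forecast - z * std_error
    upper = forecast + z * std_error
    # A value inside the interval is normal; a value outside it is an anomaly.
    return "normal" if lower <= actual <= upper else "anomaly"
```

With a forecast of 10.0 and a standard error of 1.0, the 95% interval is roughly 8.04 to 11.96, so an actual value of 10.5 is labeled normal and 13.0 is labeled an anomaly.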
[0165] In an implementation, a computer readable storage medium is provided which stores a computer program; when the computer program is executed by a processor, the following steps are achieved:
[0166] Obtaining time series data and spatial series data associated with user behavior.
[0167] According to a plurality of actual indicator values before a pre-set time point in the time series data, predicting, through an ARIMA model, a confidence interval of the indicator for the user at the pre-set time point.
[0168] Comparing the actual indicator value of the user at the pre-set time point with the corresponding confidence interval of the indicator, obtaining a first detection result for the user behavior.
[0169] According to the spatial series data, performing anomaly detection through pre-trained SOM neural network model, obtaining second detection result for the user behavior.
[0170] According to the first detection result and the second detection result, performing anomaly identification on the user behavior.
[0171] Those skilled in the art can understand that all or part of the procedures of the above-mentioned methods can be performed by related hardware instructed by a computer program; the computer program can be stored in a non-volatile computer readable storage medium and, when executed, can include the procedures of the implementations of the above-mentioned methods. Any reference to memory, storage, database, or other media used in the implementations provided in the current application can include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
[0172] The above-mentioned implementations are only several implementations of this disclosure; their description is relatively specific and detailed but should not be understood as limiting the scope of the invention patent. It should be noted that those of ordinary skill in the art can make various modifications and variations to the disclosure without departing from its spirit and scope. Therefore, the scope of protection of the present invention patent shall be subject to the appended claims.


Claims (112)

Claims:
1. A device for identifying user anomaly behavior, the device comprising:
a data obtaining module configured to obtain a time series data and a spatial series data associated with user behavior;
a first detection module configured to:
predict a confidence interval of an indicator through an autoregressive integrated moving average model when a user is at a pre-set time point according to a plurality of actual indicator values before the pre-set time point in the time series data;
compare an actual indicator value when the user is at the pre-set time point with the corresponding confidence interval of the indicator;
obtain a first detection result for the user behavior;
a second detection module configured to:
perform anomaly detection through a pre-trained self-organizing map neural network model according to the spatial series data;
obtain a second detection result for the user behavior; and
an anomaly identification module configured to perform the anomaly identification on the user behavior according to the first detection result and the second detection result.
2. The device of claim 1, further comprising a construction module for:
obtaining a time series sample data associated with a sample user behavior;
performing a stationarity test on the time series sample data;

Date Recue/Date Received 2024-01-11
differential processing the time series sample data that is non-stationary to obtain a stationary time series sample data;
establishing an initial autoregressive integrated moving average model for the stationary time series sample data;
determining the initial autoregressive integrated moving average model's autoregressive order and range of moving average order according to an autocorrelation coefficient and a partial autocorrelation coefficient of the stationary time series sample data;
determining the combination of the initial autoregressive integrated moving average model's optimal autoregressive order and range of moving average order using an Akaike's information criterion; and
constructing the autoregressive integrated moving average model.
3. The device of claim 2, further comprising a training module comprising:
an initializing submodule configured to initialize a weight of each neuron in the pre-trained self-organizing map neural network model;
a pre-processing submodule configured to:
obtain a spatial series sample data associated with behavior of a sample user;
normalization process each of the spatial series sample data; and
obtain a training sample set;
a training submodule configured to:
randomly select training samples from the training sample set to be input into a self-organizing map neural network input layer;

obtain an input vector;
search for a winning neuron corresponding to the input vector according to a Euclidean distance between the input vector and each neuron in a competition layer of a self-organizing map neural network; and
perform, using a gradient descent method, a weight update on the winning neuron and each neuron of neurons set around the winning neuron;
and an iteration submodule configured to:
iteratively execute steps in the training submodule, wherein the training ends until reaching a pre-set end condition;
obtaining the self-organizing map neural network; and
obtaining a plurality of clusters output by the self-organizing map neural network model.
4. The device of claim 3, wherein the second detection module is further configured for:
normalization processing the spatial series data, wherein the normalization processed spatial series data are used as input parameters and are input to the self-organizing map neural network;
determining, according to the Euclidean distance from the input parameters to each neuron, a plurality of winning neurons corresponding to the input parameters and a cluster of the plurality of winning neurons;
calculating a cluster area of the plurality of winning neurons;
comparing the cluster area with an area threshold, wherein when the cluster area is less than the area threshold, the cluster is an anomaly cluster; and
generating the second detection result for the user behavior according to the comparing result.
5. The device of claim 4, further comprising an anomaly processing module for performing an identity authentication on the user, or restricting the user's operations and behavior, if an identification result of the user behavior is the user anomaly behavior.
6. The device of claim 5, wherein a user data within a pre-set time period can be obtained, pre-processing the user data to extract the time series data and the spatial series data associated with the user behavior.
7. The device of claim 6, wherein the user data includes a user attribute data and a user behavior data, wherein the user attribute data includes any one or more of name, age, mailing address, wherein the user behavior data includes any one or more of IP address of account registration, IP address of each login, time information of each login, page click information, user device information, online duration and other related information, wherein the user device information includes any one or more of device MAC address, a device gyroscope data, a device acceleration data, CPU, memory, and disk I/O information.
8. The device of any one of claims 6 to 7, wherein the time series data is an indicator values series obtained by sorting the plurality of actual indicator values of users in the pre-set time period in chronological order, wherein the actual indicator value refers to the value of parameter in the indicator obtained by statistics of a user numerical data related to the user behavior in the pre-set time period, wherein the parameter in the indicator can be any one or more of values in online duration, device moving distance and change value of screen temperature.
9. The device of claim 8, wherein the spatial series data refers to a user behavior trajectory data in a spatial order, wherein there is a connection of sequence, flow, and direction between each space, wherein the user behavior trajectory data involved in the user logging in to an application to perform a transfer operation forms a user spatial series data.
10. The device of claim 9, wherein the pre-set time point can be a time point corresponding to an Nth data in M data included in the time series data, wherein N is greater than 1 and N is less than M.
11. The device of claim 10, wherein the plurality of actual indicator values before the pre-set time point in the time series data are substituted into the autoregressive integrated moving average model for predicting, wherein obtaining a plurality of predicted values of indicators at the pre-set time point and the confidence interval of the plurality of predicted values of indicators when a confidence level is a.
12. The device of claim 11, wherein the autoregressive integrated moving average model is an Auto-Regressive Moving Average model, predicting future with past and present values, wherein it regards the time series data as a random series and finds optimal function to fit.
13. The device of claim 12, wherein, the autoregressive integrated moving average model (p, q, d) is defined as the following:
$$y_t = \varphi_1 y_{t-1} + \varphi_2 y_{t-2} + \cdots + \varphi_p y_{t-p} + e_t - \theta_1 e_{t-1} - \theta_2 e_{t-2} - \cdots - \theta_q e_{t-q}$$
14. The device of claim 13, wherein p refers to the autoregressive order, d refers to a series differential order, q refers to the moving average order, $y_t$ is the time series observation value at time t, $e_t$ is a white noise series, and $\varphi_i$, $\theta_i$ are the coefficients of $y_{t-i}$ and $e_{t-i}$ respectively.
15. The device of claim 14, wherein a unit root detection method is adopted to test stationarity of a time series sample data to determine whether the time series sample data is stationary, wherein when the time series sample data is non-stationary, the time series sample data is made stationary by continuing to difference the series until the series meets the stationarity test conditions, thereby obtaining a stationary time series sample data and eliminating a data trend, wherein the differential order d of the autoregressive integrated moving average model is the number of times differencing is performed before the time series becomes a stationary time series.

16. The device of claim 15, wherein determining the differential order d of the model, based on the Akaike's information criterion, the ranges of both the autoregressive order p and moving average order q are defined, traversing the combination of (p, q), identifying the combination of (p, q) with a minimum Akaike's information criterion value, wherein, the optimal p, d and q are determined to apply in the autoregressive integrated moving average model for predicting.
17. The device of claim 16, wherein determining whether the actual indicator value at the pre-set time point is within the confidence interval of a predicted indicator value, obtaining a determining result;
generating the first detection result for user behavior according to the determining result, wherein when the actual indicator value falls outside the confidence interval, the first detection result is used for indicating that the actual indicator value at the pre-set time point is an anomaly value, and wherein when the actual indicator value falls within the confidence interval, the first detection result is used for indicating that the actual indicator value at the pre-set time point is a normal value.
18. The device of claim 17, wherein the self-organizing map neural network comprises an unsupervised neural network, and wherein network structure of the self-organizing map neural network includes two layers, the two layers including input layer and an output layer, and wherein the output layer is a competition layer, and wherein the unsupervised neural network comprises a reverse transfer of loss function training, and wherein the neurons comprise a matrix of equidistant nodes arranged in a two-dimensional form on the self-organizing map neural network, the matrix of equidistant nodes comprising the output layer, and wherein each node includes corresponding weight vector comprising the same dimension as the dimension length of an input data and performs by means of the nearest neighbor relationship function to maintain the topology of input space.

19. The device of claim 18, wherein initializing the pre-trained self-organizing map neural network model comprises initializing the weight of each neuron of the self-organizing map neural network to a very small random number, the random number being greater than 0 and less than 1, and wherein the number of model iterations, the learning rate, and the neighborhood radius also need to be initialized.
20. The device of claim 19, wherein the training process of the self-organizing map neural network comprises setting a neighborhood radius R with the winning neuron as a center, the neighborhood radius R is initialized as an initial neighborhood radius, a fixed radius is called a winning neighborhood, wherein the range of the winning neighborhood shrinks as the number of training increases and finally shrinks to a fixed value of the neighborhood radius.
21. The device of claim 20, wherein calculating the Euclidean distance between an input vector X and each neuron, the neuron with the smallest Euclidean distance to the input vector X is the winning neuron, wherein all neurons in the output layer of the self-organizing map neural network compete with each other, only one winning neuron can be activated each time.
22. The device of claim 20, wherein the neighborhood radius is set with the winning neuron as the center, and the area within the radius is called the winning neighborhood, according to the coordinates of the winning neuron and the radius of neighborhood, determining all neurons in the winning neighborhood, and using the gradient descent method to update the weight of each neuron in the winning neighborhood.
23. The device of claim 22, wherein a new input sample is read from a training sample set, and the process is executed iteratively, until completing the training of all training samples, after updating the weight values of all winning neurons, updating learning rate and neighborhood function, wherein when the number of training times of the self-organizing map neural network reaches a pre-set maximum number of times, the training and learning process is exited, obtaining the pre-trained self-organizing map neural network model, and obtaining a plurality of clusters output by the self-organizing map neural network, wherein each cluster corresponds to a neighborhood scope, wherein the neighborhood scope corresponds to the winning neighborhood, the neighborhood contains at least one neuron.

24. The device of claim 23, wherein the area threshold can be set according to a requirement of a small cluster area, wherein the small cluster area comprises an isolated cluster with a very small cluster size is set as the anomaly cluster.
25. The device of claim 24, wherein determining the neighborhood radius of a winning neighborhood where the winning neuron is located, calculating the area of circle with the radius of the neighborhood as the radius and using it as the cluster area of the cluster to which the winning neuron belongs, comparing the cluster area with the area threshold.
26. The device of claim 25, wherein when the cluster area of the cluster to which the winning neuron belongs is less than the area threshold, the second detection result is used to indicate that the user spatial series data is an anomaly data, when the cluster area of the cluster to which the winning neuron belongs is not less than the area threshold, the second detection result is used to indicate that the user spatial series data is a normal data.
27. The device of claim 26, wherein predicting the confidence interval of the indicator through the autoregressive integrated moving average model and performing anomaly detection through the pre-trained self-organizing map neural network model are executed concurrently.
28. The device of claim 27, wherein when the first detection result and the second detection result are both normal, the user behavior is determined as normal, wherein when the first detection result and the second detection result are both anomaly, the user behavior is determined as anomaly, and wherein when only one of the first detection result and the second detection result is normal, the user behavior is determined as a suspicious anomaly behavior, which can be manually identified.
29. A computer device comprising:
one or a plurality of processors;
a storage apparatus configured to store one or a plurality of programs;

a network interface;
a database connected through a system bus;
wherein one or a plurality of programs are executed by one or a plurality of processors, one or a plurality of processors configured to:
obtain a time series data and a spatial series data associated with user behavior;
predict a confidence interval of an indicator through an autoregressive integrated moving average model when a user is at a pre-set time point, according to a plurality of actual indicator values before the pre-set time point in the time series data;
compare an actual indicator value when the user is at the pre-set time point with the corresponding confidence interval of the indicator;
obtain a first detection result for the user behavior;
perform anomaly detection through a pre-trained self-organizing map neural network model according to the spatial series data, obtain a second detection result for the user behavior; and perform an anomaly identification on the user behavior according to the first detection result and the second detection result;
and wherein the network interface of the computer device is used to communicate with an external terminal through a network connection.
30. The device of claim 29, wherein the autoregressive integrated moving average model is configured by:
obtaining a time series sample data associated with a sample user behavior;
performing a stationarity test on the time series sample data;
performing differencing on the time series sample data that is non-stationary to obtain a stationary time series sample data;
establishing an initial autoregressive integrated moving average model for the stationary time series sample data;
determining the initial autoregressive integrated moving average model's autoregressive order and range of moving average order according to an autocorrelation coefficient and a partial autocorrelation coefficient of the stationary time series sample data;
determining the combination of the initial autoregressive integrated moving average model's optimal autoregressive order and range of moving average order using an Akaike's information criterion; and
constructing the autoregressive integrated moving average model.
31. The device of claim 30, wherein a self-organizing map neural network is trained by:
initializing a weight of each neuron in the pre-trained self-organizing map neural network model;
obtaining a spatial series sample data associated with behavior of a sample user;
normalizing each of the spatial series sample data;
obtaining a training sample set;
randomly selecting training samples from the training sample set to be input into a self-organizing map neural network input layer;
obtaining an input vector;
searching for a winning neuron corresponding to the input vector according to a Euclidean distance between the input vector and each neuron in a competition layer of the self-organizing map neural network;
using a gradient descent method, performing a weight update on the winning neuron and each neuron of a set of neurons around the winning neuron; and
iteratively executing:
randomly selecting training samples from the training sample set to be input into the self-organizing map neural network input layer;
obtaining the input vector;
searching for the winning neuron corresponding to the input vector according to the Euclidean distance between the input vector and each neuron in the competition layer of the self-organizing map neural network;
using the gradient descent method, performing the weight update on the winning neuron and each neuron of the set of neurons around the winning neuron;
and wherein the training ends upon reaching a pre-set end condition, thereby obtaining the self-organizing map neural network and a plurality of clusters output by the self-organizing map neural network.
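The training steps listed in the claim above can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the claimed implementation: min-max scaling is assumed for the normalization, the linear decay of learning rate and radius is assumed, a fixed iteration count stands in for the pre-set end condition, and all function names and default values are hypothetical.

```python
import math
import random


def min_max_normalize(rows):
    """Scale every feature column to [0, 1] (assumed normalization)."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [[(v - l) / (h - l) if h > l else 0.0
             for v, l, h in zip(row, lo, hi)] for row in rows]


def best_matching_unit(weights, x):
    """Grid position of the neuron with the smallest Euclidean distance to x."""
    return min(weights, key=lambda pos: math.dist(weights[pos], x))


def train_som(samples, rows=4, cols=4, iterations=500,
              lr0=0.5, radius0=2.0, seed=0):
    """Train a small self-organizing map; returns {(row, col): weight_vector}."""
    rng = random.Random(seed)
    dim = len(samples[0])
    # Initialize each neuron weight to a small random number in (0, 1).
    weights = {(r, c): [rng.uniform(0.001, 0.1) for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    for t in range(iterations):          # pre-set end condition: iteration count
        decay = 1.0 - t / iterations     # assumed linear decay schedule
        lr = lr0 * decay
        radius = max(radius0 * decay, 1.0)
        x = rng.choice(samples)          # randomly selected training sample
        winner = best_matching_unit(weights, x)
        for pos, w in weights.items():
            if math.dist(pos, winner) <= radius:
                # Gradient step on 0.5 * ||x - w||^2 moves w toward x.
                for i in range(dim):
                    w[i] += lr * (x[i] - w[i])
    return weights
```

After training, samples that share the same winning neuron can be grouped to form the clusters output by the map.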
32. The device of claim 31, wherein according to the spatial series data, performing anomaly detection through the pre-trained self-organizing map neural network model, obtaining the second detection result for the user behavior, comprising:
normalizing the spatial series data, wherein the normalized spatial series data serve as input parameters that are input to the self-organizing map neural network;
determining a plurality of winning neurons corresponding to the input parameters and a cluster of the plurality of winning neurons according to the Euclidean distance from the input parameters to each neuron;
calculating a cluster area of the plurality of winning neurons;
comparing the cluster area with an area threshold, wherein when the cluster area is less than the area threshold, the cluster is an anomaly cluster; and
generating the second detection result for user behavior according to the comparing result.
33. The device of claim 32, wherein when the anomaly identification according to the first detection result and the second detection result indicates that the user behavior is an anomaly, an identification authentication is performed on the user or the user's operations and behavior are restricted, wherein the restriction operation comprises disabling a key function on a key page of an application, wherein the key function includes viewing, inputting, and submitting.
34. The device of claim 33, wherein a user data within a pre-set time period is obtained and pre-processed to extract the time series data and the spatial series data associated with the user behavior.
35. The device of claim 34, wherein the user data includes a user attribute data and a user behavior data, wherein the user attribute data includes any one or more of name, age, and mailing address, wherein the user behavior data includes any one or more of IP address of account registration, IP address of each login, time information of each login, page click information, user device information, online duration and other related information, wherein the user device information includes any one or more of device MAC address, a device gyroscope data, a device acceleration data, CPU, memory, and disk I/O information.
36. The device of any one of claims 34 to 35, wherein the time series data is an indicator values series obtained by sorting the plurality of actual indicator values of users in the pre-set time period in chronological order, wherein the actual indicator value refers to the value of parameter in the indicator obtained by statistics of a user numerical data related to the user behavior in the pre-set time period, wherein the parameter in the indicator can be any one or more of values in online duration, device moving distance and change value of screen temperature.
37. The device of claim 36, wherein the spatial series data refers to a user behavior trajectory data in a spatial order, wherein there is a connection of sequence, flow, and direction between each space, wherein the user behavior trajectory data involved in the user logging in to the application to perform a transfer operation forms a user spatial series data.
38. The device of claim 37, wherein the pre-set time point can be a time point corresponding to an Nth data in M data included in the time series data, wherein N is greater than 1 and N is less than M.
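As an illustration of the time series construction and the pre-set time point described in the claims above, the sketch below sorts indicator values chronologically and separates the values before the Nth point from the value at the Nth point. The record layout `(timestamp, value)` and the function name `split_at_preset_point` are assumptions made for the example only.

```python
def split_at_preset_point(records, n):
    """Return (history, actual) for the Nth of M chronologically sorted values.

    records: iterable of (timestamp, indicator_value) pairs (assumed layout).
    n: 1-based position of the pre-set time point, with 1 < n < M.
    """
    series = [value for _, value in sorted(records, key=lambda rec: rec[0])]
    m = len(series)
    if not 1 < n < m:
        raise ValueError("n must satisfy 1 < n < M")
    history = series[:n - 1]   # actual indicator values before the pre-set point
    actual = series[n - 1]     # actual indicator value at the pre-set point
    return history, actual
```

The history is what would be passed to the prediction model, and the actual value is what would later be compared with the predicted confidence interval.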
39. The device of claim 38, wherein the plurality of actual indicator values before the pre-set time point in the time series data are substituted into the autoregressive integrated moving average model for predicting, thereby obtaining a plurality of predicted values of indicators at the pre-set time point and the confidence interval of the plurality of predicted values of indicators when a confidence level is α.
40. The device of claim 39, wherein the autoregressive integrated moving average model is an Auto-Regressive Moving Average model, predicting future with past and present values, wherein it regards the time series data as a random series and finds optimal function to fit.
41. The device of claim 40, wherein, the autoregressive integrated moving average model (p, q, d) is defined as the following:
$$y_t = \varphi_1 y_{t-1} + \varphi_2 y_{t-2} + \cdots + \varphi_p y_{t-p} + e_t - \theta_1 e_{t-1} - \theta_2 e_{t-2} - \cdots - \theta_q e_{t-q}.$$
42. The device of claim 41, wherein p refers to the autoregressive order, d refers to a series differential order, q refers to the moving average order, $y_t$ is the time series observation value at time moment t, $e_t$ is a white noise series, and $\varphi_i$ and $\theta_i$ are the coefficients of $y_{t-i}$ and $e_{t-i}$, respectively.
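To make the model definition concrete, the following sketch evaluates the expected value of the observation at time t from the p most recent observations and the q most recent noise terms. It assumes the moving-average terms enter with a minus sign, that the series has already been differenced d times, and that the future noise term has mean zero; the function name `arma_one_step` is hypothetical.

```python
def arma_one_step(y_hist, e_hist, phi, theta):
    """Expected y_t given past observations and past noise terms.

    y_hist: past observations, most recent last (needs at least p values).
    e_hist: past noise terms (residuals), most recent last (at least q values).
    phi:    autoregressive coefficients phi_1..phi_p.
    theta:  moving-average coefficients theta_1..theta_q.
    """
    p, q = len(phi), len(theta)
    if len(y_hist) < p or len(e_hist) < q:
        raise ValueError("not enough history for the given orders")
    ar_part = sum(phi[i] * y_hist[-1 - i] for i in range(p))
    ma_part = sum(theta[j] * e_hist[-1 - j] for j in range(q))
    # The unknown e_t is replaced by its mean of zero.
    return ar_part - ma_part
```

With `phi = [0.5, 0.25]`, `theta = [0.4]`, observations `[1.0, 2.0]`, and a last residual of `0.5`, the autoregressive part is 1.25 and the moving-average part is 0.2, giving a prediction of 1.05.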
43. The device of claim 42, wherein a unit root detection method is adopted to test stationarity of a time series sample data to determine whether the time series sample data is stationary, wherein when the time series sample data is non-stationary, the time series sample data is to be stationary processed, wherein the series continues to be differenced until the series meets stationary test conditions, thereby obtaining a stationary time series sample data to eliminate a data trend, wherein a differential order d of the autoregressive integrated moving average model is the number of times of differencing made when the time series becomes a stationary time series.
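The repeated differencing described in the claim above can be sketched as a loop that stops as soon as a stationarity check passes. The check is passed in as a callable because no unit root test is implemented here; `is_stationary`, `max_d`, and the function names are hypothetical, and a real unit root test would be supplied in practice.

```python
from typing import Callable, List, Sequence, Tuple


def difference(series: Sequence[float]) -> List[float]:
    """First-order difference: y_t - y_{t-1}."""
    return [b - a for a, b in zip(series, series[1:])]


def difference_until_stationary(
    series: Sequence[float],
    is_stationary: Callable[[Sequence[float]], bool],
    max_d: int = 3,
) -> Tuple[List[float], int]:
    """Difference the series until is_stationary accepts it.

    Returns the stationary series and the differential order d, that is,
    the number of differencing passes that were applied.
    """
    current = list(series)
    d = 0
    while not is_stationary(current):
        if d == max_d:
            raise ValueError("series not stationary after max_d differences")
        current = difference(current)
        d += 1
    return current, d
```

For a quadratic trend such as `[t * t for t in range(10)]` and a toy check that accepts only constant series, the loop stops after two passes, so d is 2.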
44. The device of claim 43, wherein, after determining the differential order d of the model, the ranges of both the autoregressive order p and moving average order q are defined based on the Akaike's information criterion, wherein the combinations of (p, q) are traversed to identify the combination of (p, q) with a minimum Akaike's information criterion value, and wherein the optimal p, d and q are determined to apply in the autoregressive integrated moving average model for predicting.
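The traversal of (p, q) combinations can be sketched as a grid search that keeps the pair with the smallest Akaike's information criterion value. Model fitting is outside the scope of this sketch, so it is represented by a hypothetical callable `fit_loglik(p, q)` that returns the maximized log-likelihood; counting p + q + 1 parameters (the extra one for the noise variance) is also an assumption.

```python
import itertools
from typing import Callable, Iterable, Tuple


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike's information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def select_order(
    p_range: Iterable[int],
    q_range: Iterable[int],
    fit_loglik: Callable[[int, int], float],
) -> Tuple[int, int, float]:
    """Return (p, q, aic_value) for the combination with the minimum AIC."""
    best = None
    for p, q in itertools.product(p_range, q_range):
        score = aic(fit_loglik(p, q), p + q + 1)
        if best is None or score < best[2]:
            best = (p, q, score)
    if best is None:
        raise ValueError("empty search range")
    return best
```

In practice `fit_loglik` would fit the model on the differenced series for each candidate pair and report its log-likelihood.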
45. The device of claim 44, wherein determining whether the actual indicator value at the pre-set time point is within the confidence interval of a predicted indicator value, obtaining a determining result;
generating the first detection result for user behavior according to the determining result, wherein when the actual indicator value falls outside the confidence interval, the first detection result is used for indicating that the actual indicator value at the pre-set time point is an anomaly value, and wherein when the actual indicator value falls within the confidence interval, the first detection result is used for indicating that the actual indicator value at the pre-set time point is a normal value.
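The comparison of the actual indicator value with the confidence interval can be sketched as follows. The interval is built here from a predicted value and a forecast standard error under an assumed Gaussian forecast error, which the claims do not specify; the function names and the default confidence of 0.95 are illustrative.

```python
from statistics import NormalDist


def confidence_interval(predicted, std_error, confidence=0.95):
    """Two-sided interval around the prediction (Gaussian error assumed)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return predicted - z * std_error, predicted + z * std_error


def first_detection(actual, predicted, std_error, confidence=0.95):
    """'normal' if the actual value lies inside the interval, else 'anomaly'."""
    low, high = confidence_interval(predicted, std_error, confidence)
    return "normal" if low <= actual <= high else "anomaly"
```

For a prediction of 10.5 with standard error 1.0, the 95% interval is roughly 8.54 to 12.46, so an actual value of 10.0 is reported as normal and 15.0 as an anomaly.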
46. The device of claim 45, wherein the self-organizing map neural network comprises an unsupervised neural network, and wherein network structure of the self-organizing map neural network includes two layers, the two layers including input layer and output layer, and wherein the output layer is a competition layer, and wherein the unsupervised neural network comprises a reverse transfer of loss function training, and wherein the neurons comprise a matrix of equidistant nodes arranged in a two-dimensional form on the self-organizing map neural network, the matrix of equidistant nodes comprising the output layer, and wherein each node includes corresponding weight vector comprising the same dimension as the dimension length of an input data and performs by means of the nearest neighbor relationship function to maintain the topology of input space.
47. The device of claim 46, wherein initializing the pre-trained self-organizing map neural network model comprises initializing the weight of each neuron of the self-organizing map neural network to a very small random number, the random number being greater than 0 and less than 1, wherein the number of model iterations, the learning rate, and the neighborhood radius also need to be initialized.
48. The device of claim 46, wherein the training process of the self-organizing map neural network comprises, setting a neighborhood radius R with the winning neuron as a center, the neighborhood radius R is initialized as an initial neighborhood radius, a fixed radius is called a winning neighborhood, wherein the range of the winning neighborhood shrinks as the number of training increases and finally shrinks to a fixed value of the neighborhood radius.
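One possible schedule for the shrinking neighborhood radius is shown below. The claim states only that the radius starts at an initial value and shrinks to a fixed value as training proceeds; the exponential form, the `time_constant` parameter, and the function name are assumptions made for illustration.

```python
import math


def neighborhood_radius(step, initial_radius, final_radius, time_constant):
    """Radius after `step` training iterations.

    Decays exponentially from initial_radius and never drops below
    final_radius, the fixed value the neighborhood finally shrinks to.
    """
    decayed = initial_radius * math.exp(-step / time_constant)
    return max(final_radius, decayed)
```

A linear or stepwise schedule would satisfy the same description; the exponential form is simply a common choice.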
49. The device of claim 48, wherein calculating the Euclidean distance between an input vector X and each neuron, the neuron with the smallest Euclidean distance to the input vector X is the winning neuron, wherein all neurons in the output layer of the self-organizing map neural network compete with each other, and only one winning neuron can be activated each time.
50. The device of claim 48, wherein the neighborhood radius is set with the winning neuron as the center, and the area within the radius is called the winning neighborhood, according to the coordinates of the winning neuron and the radius of neighborhood, determining all neurons in the winning neighborhood, and using the gradient descent method to update the weight of each neuron in the winning neighborhood.
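Determining the neurons inside the winning neighborhood from grid coordinates, and then updating their weights, can be sketched as follows. The map is assumed to be stored as a nested list indexed by row and column, and the update rule `w + lr * (x - w)` is the gradient step on the squared distance between input and weight; both function names are hypothetical.

```python
import math


def winning_neighborhood(winner, radius, rows, cols):
    """Grid coordinates of all neurons within `radius` of the winner."""
    return [(r, c) for r in range(rows) for c in range(cols)
            if math.dist((r, c), winner) <= radius]


def update_neighborhood(weights, x, winner, radius, lr):
    """Move the weights of every neuron in the winning neighborhood toward x.

    weights: nested list, weights[row][col] is that neuron's weight vector.
    """
    rows, cols = len(weights), len(weights[0])
    for r, c in winning_neighborhood(winner, radius, rows, cols):
        w = weights[r][c]
        weights[r][c] = [wi + lr * (xi - wi) for wi, xi in zip(w, x)]
```

On a 3 by 3 map with the winner at the center and a radius of 1, the neighborhood consists of the center and its four direct neighbors.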
51. The device of claim 50, wherein a new input sample is read from a training sample set, and the process is executed iteratively until the training of all training samples is completed, wherein after the weight values of all winning neurons are updated, the learning rate and neighborhood function are updated, wherein when the number of training times of the self-organizing map neural network reaches a pre-set maximum number of times, the training and learning process is exited, thereby obtaining the pre-trained self-organizing map neural network model and a plurality of clusters output by the self-organizing map neural network, wherein each cluster corresponds to a neighborhood scope, wherein the neighborhood scope corresponds to the winning neighborhood, and the neighborhood contains at least one neuron.
52. The device of claim 51, wherein the area threshold can be set according to a requirement of a small cluster area, wherein an isolated cluster with a very small cluster size is set as the anomaly cluster.
53. The device of claim 52, wherein determining the neighborhood radius of a winning neighborhood where the winning neuron is located, calculating the area of circle with the radius of the neighborhood as the radius and using it as the cluster area of the cluster to which the winning neuron belongs, comparing the cluster area with the area threshold.
54. The device of claim 53, wherein when the cluster area of the cluster to which the winning neuron belongs is less than the area threshold, the second detection result is used to indicate that the user spatial series data is an anomaly data, when the cluster area of the cluster to which the winning neuron belongs is not less than the area threshold, the second detection result is used to indicate that the user spatial series data is a normal data.
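The cluster-area test described in the claims above reduces to comparing the area of a circle with a threshold. A minimal sketch, with hypothetical function names and the threshold supplied by the caller:

```python
import math


def cluster_area(neighborhood_radius):
    """Area of the circle whose radius is the winning neighborhood radius."""
    return math.pi * neighborhood_radius ** 2


def second_detection(neighborhood_radius, area_threshold):
    """'anomaly' if the cluster area is below the threshold, else 'normal'."""
    if cluster_area(neighborhood_radius) < area_threshold:
        return "anomaly"
    return "normal"
```

With a threshold of 4.0, a radius of 1.0 gives an area of about 3.14 and is flagged as an anomaly, while a radius of 2.0 gives about 12.57 and is treated as normal.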

55. The device of claim 54, wherein predicting the confidence interval of the indicator through the autoregressive integrated moving average model and performing anomaly detection through the pre-trained self-organizing map neural network model are executed concurrently.
56. The device of claim 55, wherein when the first detection result and the second detection result are both normal, the user behavior is determined as normal, wherein when the first detection result and the second detection result are both anomaly, the user behavior is determined as anomaly, and wherein when only one of the first detection result and the second detection result is normal, the user behavior is determined as a suspicious anomaly behavior, which can be manually identified.
57. A computer readable physical memory having stored thereon a computer program executed by a computer configured to:
obtain a time series data and a spatial series data associated with user behavior;
predict a confidence interval of an indicator through an autoregressive integrated moving average model when a user is at a pre-set time point, according to a plurality of actual indicator values before the pre-set time point in the time series data;
compare an actual indicator value when the user is at the pre-set time point with the corresponding confidence interval of the indicator, obtaining a first detection result for the user behavior;
perform anomaly detection through a pre-trained self-organizing map neural network model according to the spatial series data, obtaining a second detection result for the user behavior; and
perform an anomaly identification on the user behavior according to the first detection result and the second detection result.
58. The memory of claim 57, wherein the autoregressive integrated moving average model is configured by:
obtaining a time series sample data associated with a sample user behavior;
performing a stationarity test on the time series sample data;
performing differencing on the time series sample data that is non-stationary to obtain a stationary time series sample data;
establishing an initial autoregressive integrated moving average model for the stationary time series sample data;
determining the initial autoregressive integrated moving average model's autoregressive order and range of moving average order according to an autocorrelation coefficient and a partial autocorrelation coefficient of the stationary time series sample data;
determining the combination of the initial autoregressive integrated moving average model's optimal autoregressive order and range of moving average order using an Akaike's information criterion; and
constructing the autoregressive integrated moving average model.
59. The memory of claim 58, wherein a self-organizing map neural network is trained by:
initializing a weight of each neuron in the pre-trained self-organizing map neural network model;
obtaining a spatial series sample data associated with behavior of a sample user;
normalizing each of the spatial series sample data;
obtaining a training sample set;
randomly selecting training samples from the training sample set to be input into a self-organizing map neural network input layer;
obtaining an input vector;
searching for a winning neuron corresponding to the input vector according to a Euclidean distance between the input vector and each neuron in a competition layer of a self-organizing map neural network;
using a gradient descent method, performing a weight update on the winning neuron and each neuron of a set of neurons around the winning neuron; and
iteratively executing:
randomly selecting training samples from the training sample set to be input into the self-organizing map neural network input layer;
obtaining the input vector;
searching for the winning neuron corresponding to the input vector according to the Euclidean distance between the input vector and each neuron in the competition layer of the self-organizing map neural network;
using the gradient descent method, performing the weight update on the winning neuron and each neuron of the set of neurons around the winning neuron;
and wherein the training ends upon reaching a pre-set end condition, thereby obtaining the self-organizing map neural network and a plurality of clusters output by the self-organizing map neural network.
60. The memory of claim 59, wherein according to the spatial series data, performing anomaly detection through the pre-trained self-organizing map neural network model, obtaining the second detection result for the user behavior, comprising:
normalizing the spatial series data, wherein the normalized spatial series data serve as input parameters that are input to the self-organizing map neural network;
determining the winning neurons corresponding to the input parameters and a cluster of the winning neurons according to the Euclidean distance from the input parameters to each neuron;
calculating a cluster area of the winning neurons;
comparing the cluster area with an area threshold, wherein when the cluster area is less than the area threshold, the cluster is an anomaly cluster; and
generating the second detection result for user behavior according to the comparing result.
61. The memory of claim 60, wherein when the anomaly identification according to the first detection result and the second detection result indicates that the user behavior is an anomaly, an identification authentication is performed on the user or the user's operations and behavior are restricted, wherein the restriction operation comprises disabling a key function on a key page of an application, wherein the key function includes viewing, inputting, and submitting.
62. The memory of claim 61, wherein a user data within a pre-set time period is obtained and pre-processed to extract the time series data and the spatial series data associated with the user behavior.
63. The memory of claim 62, wherein the user data includes a user attribute data and a user behavior data, wherein the user attribute data includes any one or more of name, age, and mailing address, wherein the user behavior data includes any one or more of IP address of account registration, IP address of each login, time information of each login, page click information, user device information, online duration and other related information, wherein the user device information includes any one or more of device MAC address, a device gyroscope data, a device acceleration data, CPU, memory, and disk I/O information.
64. The memory of claim 62, wherein the time series data is an indicator values series obtained by sorting the plurality of actual indicator values of users in the pre-set time period in chronological order, wherein the actual indicator value refers to the value of parameter in the indicator obtained by statistics of a user numerical data related to the user behavior in the pre-set time period, wherein the parameter in the indicator can be any one or more of values in online duration, device moving distance and change value of screen temperature.
65. The memory of claim 64, wherein the spatial series data refers to a user behavior trajectory data in a spatial order, wherein there is a connection of sequence, flow, and direction between each space, wherein the user behavior trajectory data involved in the user logging in to the application to perform a transfer operation forms a user spatial series data.
66. The memory of claim 65, wherein the pre-set time point can be a time point corresponding to an Nth data in M data included in the time series data, wherein N is greater than 1 and N is less than M.
67. The memory of claim 66, wherein the plurality of actual indicator values before the pre-set time point in the time series data are substituted into the autoregressive integrated moving average model for predicting, thereby obtaining a plurality of predicted values of indicators at the pre-set time point and the confidence interval of the plurality of predicted values of indicators when a confidence level is α.
68. The memory of claim 67, wherein the autoregressive integrated moving average model is an Auto-Regressive Moving Average model, predicting future with past and present values, wherein it regards the time series data as a random series and finds optimal function to fit.
69. The memory of claim 68, wherein, the autoregressive integrated moving average model (p, q, d) is defined as the following:
$$y_t = \varphi_1 y_{t-1} + \varphi_2 y_{t-2} + \cdots + \varphi_p y_{t-p} + e_t - \theta_1 e_{t-1} - \theta_2 e_{t-2} - \cdots - \theta_q e_{t-q}.$$
70. The memory of claim 69, wherein p refers to the autoregressive order, d refers to a series differential order, q refers to the moving average order, $y_t$ is the time series observation value at time moment t, $e_t$ is a white noise series, and $\varphi_i$ and $\theta_i$ are the coefficients of $y_{t-i}$ and $e_{t-i}$, respectively.
71. The memory of claim 70, wherein a unit root detection method is adopted to test stationarity of the time series sample data to determine whether the time series sample data is stationary, wherein when the time series sample data is non-stationary, the time series sample data is to be stationary processed, wherein the series continues to be differenced until the series meets stationary test conditions, thereby obtaining a stationary time series sample data to eliminate a data trend, wherein the differential order d of the autoregressive integrated moving average model is the number of times of differencing made when the time series becomes a stationary time series.
72. The memory of claim 71, wherein, after determining the differential order d of the model, the ranges of both the autoregressive order p and moving average order q are defined based on the Akaike's information criterion, wherein the combinations of (p, q) are traversed to identify the combination of (p, q) with a minimum Akaike's information criterion value, and wherein the optimal p, d and q are determined to apply in the autoregressive integrated moving average model for predicting.
73. The memory of claim 72, wherein determining whether the actual indicator value at the pre-set time point is within the confidence interval of a predicted indicator value, obtaining a determining result;
generating the first detection result for user behavior according to the determining result, wherein when the actual indicator value falls outside the confidence interval, the first detection result is used for indicating that the actual indicator value at the pre-set time point is an anomaly value, and wherein when the actual indicator value falls within the confidence interval, the first detection result is used for indicating that the actual indicator value at the pre-set time point is a normal value.
74. The memory of claim 73, wherein the self-organizing map neural network comprises an unsupervised neural network, and wherein network structure of the self-organizing map neural network includes two layers, the two layers including input layer and output layer, and wherein the output layer is a competition layer, and wherein the unsupervised neural network comprises a reverse transfer of loss function training, and wherein the neurons comprise a matrix of equidistant nodes arranged in a two-dimensional form on the self-organizing map neural network, the matrix of equidistant nodes comprising the output layer, wherein each node has a corresponding weight vector with the same dimension as the dimension length of an input data and performs by means of the nearest neighbor relationship function to maintain the topology of input space.
75. The memory of claim 74, wherein initializing the pre-trained self-organizing map neural network model comprises initializing the weight of each neuron of the self-organizing map neural network to a very small random number, the random number being greater than 0 and less than 1, wherein the number of model iterations, the learning rate, and the neighborhood radius also need to be initialized.
76. The memory of claim 75, wherein the training process of the self-organizing map neural network comprises, setting a neighborhood radius R with the winning neuron as a center, the neighborhood radius R is initialized as an initial neighborhood radius, a fixed radius is called a winning neighborhood, wherein the range of the winning neighborhood shrinks as the number of training increases and finally shrinks to a fixed value of the neighborhood radius.
77. The memory of claim 76, wherein calculating the Euclidean distance between an input vector X and each neuron, the neuron with the smallest Euclidean distance to the input vector X is the winning neuron, wherein all neurons in the output layer of the self-organizing map neural network compete with each other, and only one winning neuron can be activated each time.
78. The memory of claim 76, wherein the neighborhood radius is set with the winning neuron as the center, and the area within the radius is called the winning neighborhood, according to the coordinates of the winning neuron and the radius of neighborhood, determining all neurons in the winning neighborhood, and using the gradient descent method to update the weight of each neuron in the winning neighborhood.
79. The memory of claim 78, wherein a new input sample is read from a training sample set, and the process is executed iteratively until the training of all training samples is completed, wherein after the weight values of all winning neurons are updated, the learning rate and neighborhood function are updated, wherein when the number of training times of the self-organizing map neural network reaches a pre-set maximum number of times, the training and learning process is exited, thereby obtaining the pre-trained self-organizing map neural network model and a plurality of clusters output by the self-organizing map neural network, wherein each cluster corresponds to a neighborhood scope, wherein the neighborhood scope corresponds to the winning neighborhood, and the neighborhood contains at least one neuron.
80. The memory of claim 79, wherein the area threshold can be set according to a requirement of a small cluster area, wherein an isolated cluster with a very small cluster size is set as the anomaly cluster.
81. The memory of claim 80, wherein determining the neighborhood radius of a winning neighborhood where the winning neuron is located, calculating the area of circle with the radius of the neighborhood as the radius and using it as the cluster area of the cluster to which the winning neuron belongs, comparing the cluster area with the area threshold.
82. The memory of claim 81, wherein when the cluster area of the cluster to which the winning neuron belongs is less than the area threshold, the second detection result is used to indicate that the user spatial series data is an anomaly data, when the cluster area of the cluster to which the winning neuron belongs is not less than the area threshold, the second detection result is used to indicate that the user spatial series data is a normal data.
83. The memory of claim 82, wherein predicting the confidence interval of the indicator through the autoregressive integrated moving average model and performing anomaly detection through the pre-trained self-organizing map neural network model are executed concurrently.
84. The memory of claim 83, wherein when the first detection result and the second detection result are both normal, the user behavior is determined as normal, wherein when the first detection result and the second detection result are both anomaly, the user behavior is determined as anomaly, and wherein when only one of the first detection result and the second detection result is normal, the user behavior is determined as a suspicious anomaly behavior, which can be manually identified.
85. An identification method for user anomaly behavior, the method comprises:
obtaining a time series data and a spatial series data associated with user behavior;
predicting a confidence interval of an indicator through an autoregressive integrated moving average model when a user is at a pre-set time point according to a plurality of actual indicator values before the pre-set time point in the time series data;
comparing an actual indicator value when the user is at the pre-set time point with the corresponding confidence interval of the indicator;
obtaining a first detection result for the user behavior;
performing anomaly detection through a pre-trained self-organizing map neural network model according to the spatial series data;
obtaining a second detection result for the user behavior; and
performing an anomaly identification on the user behavior according to the first detection result and the second detection result.
86. The method of claim 85, wherein the autoregressive integrated moving average model is configured by:
obtaining a time series sample data associated with a sample user behavior;
performing a stationarity test on the time series sample data;
performing differencing on the time series sample data that is non-stationary to obtain a stationary time series sample data;
establishing an initial autoregressive integrated moving average model for the stationary time series sample data;
determining the initial autoregressive integrated moving average model's autoregressive order and range of moving average order according to an autocorrelation coefficient and a partial autocorrelation coefficient of the stationary time series sample data;
determining the combination of the initial autoregressive integrated moving average model's optimal autoregressive order and range of moving average order using an Akaike's information criterion; and
constructing the autoregressive integrated moving average model.
87. The method of claim 86, wherein a self-organizing map neural network is trained by:
initializing a weight of each neuron in the pre-trained self-organizing map neural network model;
obtaining a spatial series sample data associated with behavior of a sample user;
normalization processing each of the spatial series sample data;
obtaining a training sample set;
randomly selecting training samples from the training sample set to be input into a self-organizing map neural network input layer;

obtaining an input vector;
searching for a winning neuron corresponding to the input vector according to an Euclidean distance between the input vector and each neuron in a competition layer of the self-organizing map neural network;
using a gradient descent method, performing a weight update on the winning neuron and on each neuron in a set of neurons around the winning neuron; and iteratively executing:
randomly selecting training samples from the training sample set to be input into the self-organizing map neural network input layer;
obtaining the input vector;
searching for the winning neuron corresponding to the input vector according to the Euclidean distance between the input vector and each neuron in the competition layer of the self-organizing map neural network;
using the gradient descent method, performing the weight update on the winning neuron and on each neuron in the set of neurons around the winning neuron;
wherein the training ends upon reaching a pre-set end condition, thereby obtaining the self-organizing map neural network and a plurality of clusters output by the self-organizing map neural network.
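The training steps recited in claim 87 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the claimed implementation: the grid size, the linear decay of the learning rate and neighborhood radius, and the names `euclid`, `winner` and `train_som` are all hypothetical. The weight update shown, w ← w + lr·(x − w), is the usual self-organizing map rule, which corresponds to one gradient descent step on the squared distance between the input vector and the weight vector.

```python
import math
import random

def euclid(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def winner(weights, x):
    """Grid position of the neuron whose weight vector is closest to x."""
    return min(weights, key=lambda pos: euclid(weights[pos], x))

def train_som(samples, rows=3, cols=3, max_iter=200,
              lr0=0.5, r0=1.5, r_min=0.5, seed=0):
    """Train a small SOM; all defaults are illustrative assumptions."""
    rng = random.Random(seed)
    dim = len(samples[0])
    # Initialise every neuron weight to a small random number in (0, 1).
    weights = {(r, c): [rng.uniform(0.01, 0.1) for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    for t in range(max_iter):                    # pre-set end condition
        frac = t / max_iter
        lr = lr0 * (1.0 - frac)                  # assumed linear decay
        radius = max(r_min, r0 * (1.0 - frac))   # shrinks to a fixed floor
        x = rng.choice(samples)                  # random training sample
        win = winner(weights, x)
        for pos, w in weights.items():
            # Update the winner and every neuron inside its neighborhood.
            if euclid(pos, win) <= radius:
                for i in range(dim):
                    w[i] += lr * (x[i] - w[i])
    return weights
```

With normalized inputs in [0, 1] and a learning rate of at most 1, every update is a convex combination, so the weights stay inside [0, 1].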
88. The method of claim 87, wherein performing anomaly detection through the pre-trained self-organizing map neural network model according to the spatial series data and obtaining the second detection result for the user behavior comprises:
normalization processing the spatial series data, wherein the normalization processed spatial series data serve as input parameters that are input to the self-organizing map neural network;

determining the winning neurons corresponding to the input parameters and cluster of the winning neurons according to the Euclidean distance from the input parameters to each neuron;
calculating a cluster area of the winning neurons;
comparing the cluster area with an area threshold, wherein, when the cluster area is less than the area threshold, the cluster is an anomaly cluster; and generating the second detection result for the user behavior according to the comparison result.
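The normalization step of claim 88 is not tied to a particular formula in the claims. As one possible reading, the sketch below uses min-max scaling of each feature to [0, 1]; the choice of min-max scaling and the name `min_max_normalize` are assumptions made for illustration only.

```python
def min_max_normalize(rows, lo=None, hi=None):
    """Scale each feature column of `rows` to [0, 1].

    `lo` and `hi` may carry per-column minima and maxima learned from the
    training set, so that new data are scaled the same way (an assumption).
    """
    cols = list(zip(*rows))
    if lo is None:
        lo = [min(c) for c in cols]
    if hi is None:
        hi = [max(c) for c in cols]
    return [
        # A constant column carries no information and is mapped to 0.0.
        [(v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(row, lo, hi)]
        for row in rows
    ]
```

Passing the training-set `lo` and `hi` at detection time keeps the input parameters on the same scale as the neuron weights.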
89. The method of claim 88, wherein, when the first detection result and the second detection result identify the user behavior as an anomaly, an identity authentication is performed on the user, or the user's operations and behavior are restricted, wherein the restriction comprises disabling a key function on a key page of an application, and wherein the key function includes viewing, inputting, and submitting.
90. The method of claim 89, wherein user data within a pre-set time period is obtained and pre-processed to extract the time series data and the spatial series data associated with the user behavior.
91. The method of claim 90, wherein the user data includes user attribute data and user behavior data, wherein the user attribute data includes any one or more of name, age, and mailing address, wherein the user behavior data includes any one or more of IP address of account registration, IP address of each login, time information of each login, page click information, user device information, online duration, and other related information, and wherein the user device information includes any one or more of a device MAC address, device gyroscope data, device acceleration data, and CPU, memory, and disk I/O information.

92. The method of claim 90, wherein the time series data is an indicator values series obtained by sorting the plurality of actual indicator values of users in the pre-set time period in chronological order, wherein the actual indicator value refers to the value of parameter in the indicator obtained by statistics of user numerical data related to the user behavior in the pre-set time period, wherein the parameter in the indicator can be any one or more of values in online duration, device moving distance and change value of screen temperature.
93. The method of claim 92, wherein the spatial series data refers to a user behavior trajectory data in a spatial order, wherein there is a connection of sequence, flow, and direction between each space, wherein the user behavior trajectory data involved in the user logging in to the application to perform a transfer operation forms a user spatial series data.
94. The method of claim 93, wherein the pre-set time point can be a time point corresponding to an Nth data item among M data items included in the time series data, wherein N is greater than 1 and N is less than M.
95. The method of claim 94, wherein the plurality of actual indicator values before the pre-set time point in the time series data are substituted into the autoregressive integrated moving average model for predicting, to obtain a plurality of predicted indicator values at the pre-set time point and the confidence interval of the plurality of predicted indicator values at a confidence level a.
96. The method of claim 95, wherein the autoregressive integrated moving average model is an Auto-Regressive Moving Average model that predicts future values from past and present values, regarding the time series data as a random series and finding an optimal function to fit it.
97. The method of claim 96, wherein, the autoregressive integrated moving average model (p, q, d) is defined as the following:
$$y_t = \varphi_1 y_{t-1} + \varphi_2 y_{t-2} + \cdots + \varphi_p y_{t-p} + e_t - \theta_1 e_{t-1} - \theta_2 e_{t-2} - \cdots - \theta_q e_{t-q}$$
98. The method of claim 97, wherein p refers to the autoregressive order, d refers to the series differencing order, q refers to the moving average order, y_t is the time series observation value at time t, e_t is a white noise series, and φ_i and θ_i are the coefficients of y_{t-i} and e_{t-i} respectively.
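To make the roles of the symbols in claims 97 and 98 concrete, the following sketch evaluates the recurrence for one time step: an autoregressive sum over past observations, plus the current noise term, minus a moving average sum over past noise terms. The function name `arma_one_step` and the list layout (most recent value last) are assumptions for illustration.

```python
def arma_one_step(y_hist, e_hist, phi, theta, e_t=0.0):
    """Evaluate y_t = sum_i phi_i*y_{t-i} + e_t - sum_j theta_j*e_{t-j}.

    y_hist[-1] is y_{t-1} and e_hist[-1] is e_{t-1}; phi has p entries
    (phi_1..phi_p) and theta has q entries (theta_1..theta_q).
    For a forecast, the unknown current noise e_t is taken as 0.
    """
    ar_part = sum(phi[i] * y_hist[-1 - i] for i in range(len(phi)))
    ma_part = sum(theta[j] * e_hist[-1 - j] for j in range(len(theta)))
    return ar_part + e_t - ma_part

# p = 2, q = 1: 0.5*2.0 + 0.25*1.0 - 0.4*0.5
y_next = arma_one_step([1.0, 2.0], [0.5], phi=[0.5, 0.25], theta=[0.4])
```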
99. The method of claim 98, wherein a unit root test is adopted to test the stationarity of the time series sample data, wherein, when the time series sample data is non-stationary, the series is differenced repeatedly until it meets the stationarity test conditions, thereby obtaining stationary time series sample data with the data trend eliminated, and wherein the differencing order d of the autoregressive integrated moving average model is the number of times the series was differenced before it became a stationary time series.
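The loop described in claim 99 (test, difference, test again, and count the differences to obtain d) might look like the sketch below. The stationarity check is passed in as a callable because the unit root test itself would normally come from a statistics library; the toy check used in the example, which only accepts a constant series, is a stand-in and not a unit root test. All names are hypothetical.

```python
def difference(series):
    """First-order differencing: [s[1]-s[0], s[2]-s[1], ...]."""
    return [b - a for a, b in zip(series, series[1:])]

def difference_until_stationary(series, is_stationary, max_d=3):
    """Difference until is_stationary(series) holds; return (series, d)."""
    d = 0
    current = list(series)
    while not is_stationary(current):
        if d == max_d:
            raise ValueError("series still non-stationary after max_d differences")
        current = difference(current)
        d += 1
    return current, d

def toy_is_stationary(s):
    """Stand-in for a unit root test: accepts only a constant series."""
    return max(s) - min(s) < 1e-9

# A quadratic trend needs two differences before it becomes constant.
stationary, d = difference_until_stationary([1, 4, 9, 16, 25], toy_is_stationary)
```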
100. The method of claim 99, wherein, after the differencing order d of the model is determined, the ranges of both the autoregressive order p and the moving average order q are defined, the combinations of (p, q) are traversed, and the combination of (p, q) with the minimum Akaike's information criterion value is identified, and wherein the optimal p, d and q are applied in the autoregressive integrated moving average model for predicting.
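The traversal of (p, q) combinations in claim 100 can be illustrated as a plain grid search. The sketch assumes the usual definition AIC = 2k − 2·ln(L), with k counted here as p + q + 1 (the coefficients plus the noise variance), and it assumes a caller-supplied function `fit_loglik(p, q)` that fits a model and returns its maximised log-likelihood. That callback, the parameter count, and the toy likelihood in the example are hypothetical.

```python
from itertools import product

def aic(log_likelihood, k):
    """Akaike's information criterion for k estimated parameters."""
    return 2 * k - 2 * log_likelihood

def select_order(fit_loglik, p_range, q_range):
    """Return (best_aic, p, q) over all (p, q) combinations."""
    best = None
    for p, q in product(p_range, q_range):
        score = aic(fit_loglik(p, q), k=p + q + 1)
        if best is None or score < best[0]:
            best = (score, p, q)
    return best

def toy_loglik(p, q):
    """Toy stand-in for model fitting: the fit improves once p >= 1."""
    return -10.0 + 3.0 * min(p, 1) + 0.5 * q

best_aic, best_p, best_q = select_order(toy_loglik, range(0, 3), range(0, 3))
```

The penalty term 2k is what stops the search from always choosing the largest orders: an extra parameter is kept only if it raises the log-likelihood by more than 1.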
101. The method of claim 100, wherein it is determined whether the actual indicator value at the pre-set time point is within the confidence interval of a predicted indicator value, obtaining a determination result;
generating the first detection result for the user behavior according to the determination result, wherein, when the actual indicator value falls outside the confidence interval, the first detection result indicates that the actual indicator value at the pre-set time point is an anomaly value, and when the actual indicator value falls within the confidence interval, the first detection result indicates that the actual indicator value at the pre-set time point is a normal value.
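The comparison in claim 101 reduces to an interval membership test. The sketch below additionally shows one common way such an interval could be built from a point forecast and its standard error, assuming normally distributed forecast errors; that distributional assumption, the default confidence level of 0.95, and the function names are not taken from the claims.

```python
from statistics import NormalDist

def confidence_interval(forecast, std_err, confidence=0.95):
    """Two-sided interval around a forecast, assuming Gaussian errors."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return forecast - z * std_err, forecast + z * std_err

def first_detection(actual, forecast, std_err, confidence=0.95):
    """Return 'normal' if the actual value lies inside the interval."""
    low, high = confidence_interval(forecast, std_err, confidence)
    return "normal" if low <= actual <= high else "anomaly"
```

For a forecast of 10.0 with standard error 1.0, the 95% interval is roughly 8.04 to 11.96, so an actual value of 11.0 is reported as normal and 13.0 as an anomaly.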

102. The method of claim 101, wherein the self-organizing map neural network comprises an unsupervised neural network, and wherein the network structure of the self-organizing map neural network includes two layers, the two layers including an input layer and an output layer, and wherein the output layer is a competition layer, and wherein the unsupervised neural network comprises a reverse transfer of loss function training, and wherein the neurons comprise a matrix of equidistant nodes arranged in a two-dimensional form on the self-organizing map neural network, the matrix of equidistant nodes comprising the output layer, and wherein each node includes a corresponding weight vector having the same dimension as the input data and uses a nearest neighbor relationship function to maintain the topology of the input space.
103. The method of claim 102, wherein initializing the pre-trained self-organizing map neural network model comprises initializing the weight of each neuron of the self-organizing map neural network to a very small random number greater than 0 and less than 1, and wherein the number of model iterations, the learning rate, and the neighborhood radius are also initialized.
104. The method of claim 103, wherein the training process of the self-organizing map neural network comprises setting a neighborhood radius R with the winning neuron as a center, the neighborhood radius R being initialized as an initial neighborhood radius, wherein the area within the radius is called a winning neighborhood, and wherein the range of the winning neighborhood shrinks as the number of training iterations increases and finally shrinks to a fixed value of the neighborhood radius.
105. The method of claim 104, wherein the Euclidean distance between an input vector X and each neuron is calculated, and the neuron with the smallest Euclidean distance to the input vector X is the winning neuron, wherein all neurons in the output layer of the self-organizing map neural network compete with each other and only one winning neuron can be activated each time.

106. The method of claim 104, wherein the neighborhood radius is set with the winning neuron as the center, and the area within the radius is called the winning neighborhood, wherein all neurons in the winning neighborhood are determined according to the coordinates of the winning neuron and the neighborhood radius, and the gradient descent method is used to update the weight of each neuron in the winning neighborhood.
107. The method of claim 106, wherein a new input sample is read from the training sample set and the process is executed iteratively until all training samples have been trained, wherein, after the weight values of all winning neurons are updated, the learning rate and the neighborhood function are updated, wherein, when the number of training times of the self-organizing map neural network reaches a pre-set maximum number of times, the training and learning process is exited, obtaining the pre-trained self-organizing map neural network model and a plurality of clusters output by the self-organizing map neural network, and wherein each cluster corresponds to a neighborhood scope, the neighborhood scope corresponds to the winning neighborhood, and the neighborhood contains at least one neuron.
108. The method of claim 107, wherein the area threshold can be set according to a requirement of a small cluster area, such that an isolated cluster with a very small cluster size is set as the anomaly cluster.
109. The method of claim 108, wherein the neighborhood radius of the winning neighborhood where the winning neuron is located is determined, the area of a circle having the neighborhood radius as its radius is calculated and used as the cluster area of the cluster to which the winning neuron belongs, and the cluster area is compared with the area threshold.
110. The method of claim 109, wherein, when the cluster area of the cluster to which the winning neuron belongs is less than the area threshold, the second detection result indicates that the user spatial series data is anomaly data, and when the cluster area of the cluster to which the winning neuron belongs is not less than the area threshold, the second detection result indicates that the user spatial series data is normal data.
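Claims 109 and 110 describe a simple geometric test: treat the winning neighborhood as a circle, compute its area from the neighborhood radius, and compare that area with the threshold. A minimal sketch follows; the function names and the example threshold are hypothetical.

```python
import math

def cluster_area(neighborhood_radius):
    """Area of the circle whose radius is the winning neighborhood radius."""
    return math.pi * neighborhood_radius ** 2

def second_detection(neighborhood_radius, area_threshold):
    """'anomaly' when the cluster area is below the threshold, else 'normal'."""
    if cluster_area(neighborhood_radius) < area_threshold:
        return "anomaly"
    return "normal"
```

With an area threshold of 4.0, a radius of 1.0 gives an area of about 3.14 (anomaly), while a radius of 2.0 gives about 12.57 (normal).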

111. The method of claim 110, wherein predicting the confidence interval of the indicator through the autoregressive integrated moving average model and performing anomaly detection through the pre-trained self-organizing map neural network model are executed concurrently.
112. The method of claim 111, wherein, when the first detection result and the second detection result are both normal, the user behavior is determined as normal, when the first detection result and the second detection result are both anomaly, the user behavior is determined as anomaly, and when only one of the first detection result and the second detection result is normal, the user behavior is determined as a suspicious anomaly behavior, wherein the suspicious anomaly behavior can be manually identified.
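The decision rule of claim 112 is a three-way combination of the two detection results. It can be written as a small function; the string labels and the name `combine_results` are illustrative choices, not terms from the claims.

```python
def combine_results(first, second):
    """Combine the two detection results.

    Both 'normal' -> 'normal'; both 'anomaly' -> 'anomaly';
    exactly one 'normal' -> 'suspicious' (left for manual identification).
    """
    allowed = {"normal", "anomaly"}
    if first not in allowed or second not in allowed:
        raise ValueError("results must be 'normal' or 'anomaly'")
    if first == second:
        return first
    return "suspicious"
```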

CA3132346A 2020-09-29 2021-09-29 User abnormal behavior recognition method and device and computer readable storage medium Active CA3132346C (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011047099.2A CN111898758B (en) 2020-09-29 2020-09-29 User abnormal behavior identification method and device and computer readable storage medium
CN202011047099.2 2020-09-29

Publications (2)

Publication Number Publication Date
CA3132346A1 CA3132346A1 (en) 2022-03-29
CA3132346C true CA3132346C (en) 2024-03-19

Family

ID=73224018

Family Applications (1)

Application Number Title Priority Date Filing Date
CA3132346A Active CA3132346C (en) 2020-09-29 2021-09-29 User abnormal behavior recognition method and device and computer readable storage medium

Country Status (2)

Country Link
CN (1) CN111898758B (en)
CA (1) CA3132346C (en)

Families Citing this family (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112288571B (en) * 2020-11-24 2022-06-10 重庆邮电大学 Personal credit risk assessment method based on rapid construction of neighborhood coverage
CN112907622A (en) * 2021-01-20 2021-06-04 厦门市七星通联科技有限公司 Method, device, equipment and storage medium for identifying track of target object in video
CN113052314B (en) * 2021-05-27 2021-09-14 华中科技大学 Authentication radius guide attack method, optimization training method and system
CN113569910B (en) * 2021-06-25 2024-06-21 石化盈科信息技术有限责任公司 Account type identification method, account type identification device, computer equipment and storage medium
CN113971119B (en) * 2021-10-21 2023-02-07 云纷(上海)信息科技有限公司 Unsupervised model-based user behavior anomaly analysis and evaluation method and system
CN114742102B (en) * 2022-03-30 2023-05-30 中国人民解放军战略支援部队航天工程大学 NLOS signal identification method and system
CN114419528B (en) * 2022-04-01 2022-07-08 浙江口碑网络技术有限公司 Anomaly identification method and device, computer equipment and computer readable storage medium
CN115018053A (en) * 2022-06-16 2022-09-06 河南工业大学 Air quality monitoring data calibration method and device for self-organizing robust width network
CN115618247B (en) * 2022-09-26 2024-07-19 中电金信软件(上海)有限公司 Abnormality detection method, abnormality detection device, electronic device, and storage medium
CN115565623B (en) * 2022-10-19 2023-06-09 中国矿业大学(北京) Analysis method, system, electronic equipment and storage medium for coal geological composition
CN116204805B (en) * 2023-04-24 2023-07-21 青岛鑫屋精密机械有限公司 Micro-pressure oxygen cabin and data management system
CN117033052B (en) * 2023-08-14 2024-05-24 企口袋(重庆)数字科技有限公司 Object abnormality diagnosis method and system based on model identification
CN117034179B (en) * 2023-10-10 2024-02-02 国网山东省电力公司营销服务中心(计量中心) Abnormal electric quantity identification and tracing method and system based on graph neural network
CN117130016B (en) * 2023-10-26 2024-02-06 深圳市麦微智能电子有限公司 Personal safety monitoring system, method, device and medium based on Beidou satellite
CN117455555B (en) * 2023-12-25 2024-03-08 厦门理工学院 Big data-based electric business portrait analysis method and system
CN117828688A (en) * 2024-01-29 2024-04-05 北京亚鸿世纪科技发展有限公司 Data security processing method and system
CN118039114A (en) * 2024-02-27 2024-05-14 浙江普康智慧养老产业科技有限公司 Intelligent state monitoring method based on intelligent endowment remote monitoring system
CN117906726B (en) * 2024-03-19 2024-06-04 西安艺琳农业发展有限公司 Abnormal detection system for weight data of live cattle body ruler

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106789149B (en) * 2016-11-18 2020-08-14 北京工业大学 Intrusion detection method adopting improved self-organizing characteristic neural network clustering algorithm
CN109587713B (en) * 2018-12-05 2022-01-11 广州数锐智能科技有限公司 Network index prediction method and device based on ARIMA model and storage medium
CN111178523B (en) * 2019-08-02 2023-06-06 腾讯科技(深圳)有限公司 Behavior detection method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN111898758A (en) 2020-11-06
CA3132346A1 (en) 2022-03-29
CN111898758B (en) 2021-03-02

Similar Documents

Publication Publication Date Title
CA3132346C (en) User abnormal behavior recognition method and device and computer readable storage medium
US10785241B2 (en) URL attack detection method and apparatus, and electronic device
US11113394B2 (en) Data type recognition, model training and risk recognition methods, apparatuses and devices
TWI673625B (en) Uniform resource locator (URL) attack detection method, device and electronic device
JP6876801B2 (en) Methods, devices, and electronics to identify risks associated with the transaction being processed
US8676726B2 (en) Automatic variable creation for adaptive analytical models
US20190130101A1 (en) Methods and apparatus for detecting a side channel attack using hardware performance counters
US20230086187A1 (en) Detection of anomalies associated with fraudulent access to a service platform
CN111353082B (en) Method, apparatus and computer readable storage medium for yield analysis
CN109981583A (en) A kind of industry control network method for situation assessment
WO2021168617A1 (en) Processing method and apparatus for service risk management, electronic device, and storage medium
Xiao et al. Self-checking deep neural networks for anomalies and adversaries in deployment
US20210264306A1 (en) Utilizing machine learning to detect single and cluster-type anomalies in a data set
Ganji et al. Shuffled shepherd political optimization‐based deep learning method for credit card fraud detection
Lim et al. More powerful selective kernel tests for feature selection
CN116305103A (en) Neural network model backdoor detection method based on confidence coefficient difference
CN115438747A (en) Abnormal account recognition model training method, device, equipment and medium
Parihar et al. IDS with deep learning techniques
CN116910682B (en) Event detection method and device, electronic equipment and storage medium
US11991037B2 (en) Systems and methods for reducing a quantity of false positives associated with rule-based alarms
EP4254228A1 (en) A training method for training an assembling model to detect a condition
US20240152604A1 (en) System and method for automatically generating playbook and verifying validity of playbook based on artificial intelligence
US11782700B2 (en) Method and system for automatic assignment of code topics
WO2023132061A1 (en) Training method, information processing device, and training program
Chen et al. On a Hybrid BiLSTM-GCNN-Based Approach for Attack Detection in SDN