Disclosure of Invention
In order to solve the technical problems mentioned in the background art, the present invention provides a method, an apparatus and a computer readable storage medium for identifying abnormal user behavior, so as to accurately and reliably identify the abnormal user behavior.
The embodiment of the invention provides the following specific technical scheme:
In a first aspect, a method for identifying abnormal user behavior is provided, the method comprising:
acquiring time sequence data and space sequence data associated with the user behavior;
predicting an index confidence interval of the user at a preset time point through an ARIMA model according to a plurality of index actual values before the preset time point in the time sequence data;
comparing the actual index value of the user at the preset time point with the corresponding index confidence interval to obtain a first detection result aiming at the behavior of the user;
according to the space sequence data, carrying out anomaly detection through a pre-trained SOM neural network model to obtain a second detection result aiming at the user behavior;
and performing abnormal recognition on the user behavior according to the first detection result and the second detection result.
Further, the ARIMA model is constructed in the following way:
acquiring time series sample data associated with the behavior of a sample user;
performing a stationarity test on the time sequence sample data, and differencing the time sequence sample data that fails the test to obtain stationary time sequence sample data;
aiming at the stationary time sequence sample data, establishing an initial ARIMA model, and determining the range of an autoregressive order and a moving average order of the initial ARIMA model according to the autocorrelation coefficient and the partial autocorrelation coefficient of the stationary time sequence sample data;
and determining the combination of the optimal autoregressive order and the moving average order of the initial ARIMA model by adopting an AIC information criterion, and constructing to obtain the ARIMA model.
Further, the SOM neural network model is trained by the following method:
S1, initializing the weight of each neuron in a preset SOM neural network;
S2, acquiring spatial sequence sample data associated with the behaviors of sample users, and normalizing each piece of spatial sequence sample data to obtain a training sample set;
S3, randomly selecting a training sample from the training sample set and inputting it to the input layer of the SOM neural network to obtain an input vector;
S4, finding the winning neuron corresponding to the input vector according to the Euclidean distance between the input vector and each neuron in the competition layer of the SOM neural network;
S5, updating, by a gradient descent method, the weights of the winning neuron and of each neuron in the neuron set within its neighborhood range;
and S6, iteratively executing steps S3 to S5 until a preset end condition is reached and training ends, thereby obtaining the SOM neural network model and a plurality of clusters output by the SOM neural network model.
Further, the obtaining a second detection result for the behavior of the user by performing anomaly detection through a pre-trained SOM neural network model according to the spatial sequence data includes:
normalizing the spatial sequence data, inputting the normalized spatial sequence data serving as input parameters into the SOM neural network model, and determining a winning neuron corresponding to the input parameters and a neighborhood to which the winning neuron belongs according to Euclidean distances from the input parameters to each neuron;
calculating a clustering area of a cluster to which the winning neuron belongs, and comparing the clustering area with an area threshold, wherein the cluster to which the winning neuron belongs is an abnormal cluster only when the clustering area is smaller than the area threshold;
and generating a second detection result aiming at the behavior of the user according to the comparison result.
Further, after performing anomaly identification on the behavior of the user according to the first detection result and the second detection result, the method further includes:
and if the identification result of the user behavior indicates that the user behavior is abnormal, performing identity authentication on the user or limiting the operation behavior of the user.
In a second aspect, an apparatus for identifying abnormal user behavior is provided, the apparatus comprising:
the data acquisition module is used for acquiring time sequence data and space sequence data which are associated with the behaviors of the user;
the first detection module is used for predicting an index confidence interval of the user at a preset time point through an ARIMA (autoregressive integrated moving average) model according to a plurality of index actual values before the preset time point in the time sequence data, comparing the index actual value of the user at the preset time point with the corresponding index confidence interval, and obtaining a first detection result aiming at the behavior of the user;
the second detection module is used for carrying out anomaly detection through a pre-trained SOM neural network model according to the spatial sequence data to obtain a second detection result aiming at the behavior of the user;
and the abnormity identification module is used for carrying out abnormity identification on the user behavior according to the first detection result and the second detection result.
Further, the apparatus further comprises a construction module, which is specifically configured to:
acquiring time series sample data associated with the behavior of a sample user;
performing a stationarity test on the time sequence sample data, and differencing the time sequence sample data that fails the test to obtain stationary time sequence sample data;
aiming at the stationary time sequence sample data, establishing an initial ARIMA model, and determining the range of an autoregressive order and a moving average order of the initial ARIMA model according to the autocorrelation coefficient and the partial autocorrelation coefficient of the stationary time sequence sample data;
and determining the combination of the optimal autoregressive order and the moving average order of the initial ARIMA model by adopting an AIC information criterion, and constructing to obtain the ARIMA model.
Further, the apparatus further comprises a training module, the training module comprising:
the initialization submodule is used for initializing the weight of each neuron in a preset SOM neural network;
the preprocessing submodule is used for acquiring space sequence sample data associated with behaviors of sample users and carrying out normalization processing on each space sequence sample data to obtain a training sample set;
the training submodule is used for randomly selecting a training sample from the training sample set and inputting the training sample to an input layer of the SOM neural network to obtain an input vector, searching a winning neuron corresponding to the input vector according to the Euclidean distance between the input vector and each neuron in a competition layer of the SOM neural network, and updating the weight of each neuron in the winning neuron and a neuron set in a neighborhood range by using a gradient descent method;
and the iteration submodule is used for repeatedly executing the step of the training submodule until the training is finished when a preset finishing condition is reached, obtaining the SOM neural network model, and obtaining a plurality of clusters output by the SOM neural network model.
Further, the second detection module is specifically configured to:
normalizing the spatial sequence data, inputting the normalized spatial sequence data serving as input parameters into the SOM neural network model, and determining a winning neuron corresponding to the input parameters and a cluster to which the winning neuron belongs according to the Euclidean distance from the input parameters to each neuron;
calculating a clustering area of a cluster to which the winning neuron belongs, and comparing the clustering area with an area threshold, wherein the cluster to which the winning neuron belongs is an abnormal cluster only when the clustering area is smaller than the area threshold;
and generating a second detection result aiming at the behavior of the user according to the comparison result.
Further, the apparatus further includes an exception handling module, which is specifically configured to:
and if the identification result of the user behavior indicates that the user behavior is abnormal, performing identity authentication on the user or limiting the operation behavior of the user.
In a third aspect, a computer device is provided, comprising:
one or more processors;
storage means for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the following steps:
acquiring time sequence data and space sequence data associated with the user behavior;
predicting an index confidence interval of the user at a preset time point through an ARIMA model according to a plurality of index actual values before the preset time point in the time sequence data;
comparing the actual index value of the user at the preset time point with the corresponding index confidence interval to obtain a first detection result aiming at the behavior of the user;
according to the space sequence data, carrying out anomaly detection through a pre-trained SOM neural network model to obtain a second detection result aiming at the user behavior;
and performing abnormal recognition on the user behavior according to the first detection result and the second detection result.
In a fourth aspect, a computer-readable storage medium is provided, on which a computer program is stored, which program, when executed by a processor, performs the operational steps of:
acquiring time sequence data and space sequence data associated with the user behavior;
predicting an index confidence interval of the user at a preset time point through an ARIMA model according to a plurality of index actual values before the preset time point in the time sequence data;
comparing the actual index value of the user at the preset time point with the corresponding index confidence interval to obtain a first detection result aiming at the behavior of the user;
according to the space sequence data, carrying out anomaly detection through a pre-trained SOM neural network model to obtain a second detection result aiming at the user behavior;
and performing abnormal recognition on the user behavior according to the first detection result and the second detection result.
Compared with the prior art, the technical scheme provided by the invention realizes the following technical effects:
1. the SOM neural network clustering algorithm is nonlinear and robust, has strong adaptive learning capability, and is well suited to handling uncertain or fuzzy information; it overcomes the K-means algorithm's dependence on a predetermined K value and its sensitivity to noisy data, so that the reliability and accuracy of identifying abnormal user behavior are improved;
2. by combining the ARIMA model with the SOM neural network model, abnormal points in user behavior are mined in both the temporal and the spatial dimension; compared with a single traditional method, this improves the ability to identify abnormal points and the accuracy of identifying abnormal behavior.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It is to be understood that, unless the context clearly requires otherwise, throughout the description and the claims, the words "comprise", "comprising", and the like are to be construed in an inclusive sense as opposed to an exclusive or exhaustive sense; that is, what is meant is "including, but not limited to".
Furthermore, in the description of the present invention, it is to be understood that the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. In addition, in the description of the present invention, "a plurality" means two or more unless otherwise specified.
As described in the background art, abnormal user behavior is currently identified mostly with unsupervised machine learning algorithms such as K-Means clustering. However, the K-Means algorithm requires the number of classes (K) to be determined in advance, and after finding the most similar class for each input it updates only the parameters of that class. As a result, its output varies from run to run under the influence of initial values and noisy data, and risky users cannot be identified accurately and reliably. Therefore, the embodiment of the invention provides a method for identifying abnormal user behavior in which an ARIMA model and an SOM neural network model are combined, so that abnormal points in user behavior are mined in both the temporal and the spatial dimension. Compared with a single traditional method, this improves the ability to identify abnormal points and the accuracy of identifying abnormal behavior. Meanwhile, the SOM neural network clustering algorithm is nonlinear and robust, has strong adaptive learning capability, and is well suited to handling uncertain or fuzzy information, which overcomes the K-means algorithm's dependence on a predetermined K value and its sensitivity to noisy data.
Example one
The embodiment of the invention provides a user abnormal behavior identification method, which is applied to a user abnormal behavior identification device, wherein the device can be configured in any computer equipment, and the computer equipment can be a server, and the server can be an independent server or a server cluster consisting of a plurality of servers.
As shown in fig. 1, the method for identifying abnormal user behavior according to the embodiment of the present invention may include the following steps:
Step 101, acquiring time sequence data and space sequence data associated with the user behavior.
Specifically, user data in a preset time period may be acquired, the user data may be preprocessed, and time series data and spatial series data associated with a user behavior may be extracted.
The user data includes user attribute data and user behavior data, and the user attribute data may include: name, age, communication address, etc.; the user behavior data may include an IP address of an account registration place, an IP address at each login, time information of each login, page click information, user equipment information, online time and other related information, and the user equipment information may include information such as a device MAC address, device gyroscope data, device acceleration data, a CPU, a memory, a disk I/O and the like.
The time sequence data is an index value sequence obtained by sequencing actual index values of the user in a preset time period according to the time sequence. The index value refers to a parameter index value obtained by counting numerical data related to user behaviors in a preset time period. The parameter index may be one of an online time, a device moving distance, and a screen temperature change value, and may further include other indexes.
The space sequence data refers to behavior trajectory data of the user on the application that has a spatial order, in which the spaces are connected to one another by sequence, flow and direction. For example, the behavior trajectory data generated when the user logs in to the application and performs a transfer operation forms the space sequence data of the user.
Step 102, predicting an index confidence interval of the user at the preset time point through an ARIMA model according to a plurality of index actual values before the preset time point in the time sequence data.
The preset time point may be the time point corresponding to the N-th data item among the M data items included in the time series data, where N is greater than 1 and N is less than or equal to M.
Specifically, a plurality of actual index values before the preset time point in the time series data are substituted into the ARIMA model for prediction, so as to obtain an index predicted value at the preset time point and a confidence interval of the index predicted value at confidence level α.
The ARIMA (Autoregressive Integrated Moving Average) model uses past and present values to predict future values: the time series is regarded as a random sequence, and an optimal function is found to fit that random sequence.
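Wherein the ARIMA(p, d, q) model is defined as follows:
$$\Delta^{d} X_t = \sum_{i=1}^{p} \varphi_i \, \Delta^{d} X_{t-i} + \varepsilon_t + \sum_{j=1}^{q} \theta_j \, \varepsilon_{t-j}$$
wherein p is the autoregressive order, d is the differencing order, q is the moving average order, $\Delta^{d}$ denotes d-th order differencing, $X_t$ is the observed value of the time series at time t, $\varepsilon_t$ is a white noise sequence, and $\varphi_i$ and $\theta_j$ are the coefficients of $\Delta^{d} X_{t-i}$ and $\varepsilon_{t-j}$, respectively.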
Further, the ARIMA model can be constructed through the following steps a to d:
Step a, acquiring time series sample data associated with the behavior of the sample user.
Specifically, the implementation process of this step may refer to the acquisition process of the time series data in step 101, and is not described herein again.
Step b, performing a stationarity test on the time sequence sample data, and differencing the time sequence sample data that fails the test to obtain stationary time sequence sample data.
Specifically, a unit root test is used to check whether the time sequence sample data is stationary. If the data is non-stationary, it is made stationary by differencing the sequence repeatedly until the differenced sequence passes the stationarity test, which removes the trend from the data and yields stationary time sequence sample data. The differencing order d of the ARIMA model is the number of differencing passes needed for the time sequence to become stationary.
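As a non-limiting illustration, the repeated differencing described above can be sketched as follows. The helper name `difference` is hypothetical, and the unit root test itself is not implemented here; in practice the loop would stop at the first d for which that test reports a stationary sequence.

```python
def difference(series, d=1):
    """Apply first-order differencing d times to a list of numbers."""
    out = list(series)
    for _ in range(d):
        # Each pass replaces the sequence by consecutive differences.
        out = [b - a for a, b in zip(out, out[1:])]
    return out


# A quadratic trend becomes constant after two differencing passes.
samples = [1, 4, 9, 16, 25]
print(difference(samples, 1))  # [3, 5, 7, 9]
print(difference(samples, 2))  # [2, 2, 2]
```

Each pass shortens the sequence by one element, so d should stay small relative to the sample length.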
Step c, establishing an initial ARIMA model for the stationary time sequence sample data, and determining the range of the autoregressive order and the moving average order of the initial ARIMA model according to the autocorrelation coefficient and the partial autocorrelation coefficient of the stationary time sequence sample data.
Step d, determining the optimal combination of the autoregressive order and the moving average order of the initial ARIMA model by using the AIC information criterion, and constructing the ARIMA model.
Specifically, with the differencing order d of the model already determined and the ranges of the autoregressive order p and the moving average order q defined, the (p, q) combinations are traversed with the AIC information criterion as the standard, and the (p, q) combination with the minimum AIC value is selected. Finally, the determined optimal p, d and q are applied to the ARIMA model for prediction.
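The order search can be sketched as below, purely as an illustration. The `fit` callback is a hypothetical stand-in for whatever routine fits an ARIMA model and returns its maximized log-likelihood, the parameter count `p + q + 1` is an assumption (one extra parameter for the noise variance), and the log-likelihood values are made up for the example.

```python
from itertools import product


def aic(log_likelihood, num_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * num_params - 2 * log_likelihood


def select_order(p_range, q_range, fit):
    """Return the (p, q) pair with the smallest AIC.

    `fit(p, q)` is a hypothetical callback returning the maximized
    log-likelihood of the model fitted with those orders.
    """
    best = None
    for p, q in product(p_range, q_range):
        score = aic(fit(p, q), p + q + 1)  # +1: assumed noise-variance parameter
        if best is None or score < best[0]:
            best = (score, p, q)
    return best[1], best[2]


# Made-up log-likelihoods, for illustration only.
fake_loglik = {(0, 0): -120.0, (0, 1): -112.0, (1, 0): -110.0, (1, 1): -109.5}
print(select_order(range(2), range(2), lambda p, q: fake_loglik[(p, q)]))  # (1, 0)
```

Here (1, 1) has the highest likelihood but loses to (1, 0) because the AIC penalizes the extra parameter.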
Step 103, comparing the actual index value of the user at the preset time point with the corresponding index confidence interval to obtain a first detection result for the behavior of the user.
Specifically, it is judged whether the actual index value at the preset time point lies within the predicted index confidence interval, and a first detection result for the behavior of the user is generated according to the judgment result. When the actual index value falls outside the index confidence interval, the first detection result indicates that the actual index value at the preset time point is an abnormal value; when the actual index value falls within the index confidence interval, the first detection result indicates that the actual index value at the preset time point is a normal value.
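A minimal sketch of this comparison is given below. It assumes the forecast errors are normally distributed, so that the interval is the predicted value plus or minus a normal quantile times a standard error; the function names, the sample numbers, and the choice to treat the interval endpoints as normal are all assumptions made for illustration.

```python
from statistics import NormalDist


def confidence_interval(forecast, std_error, confidence=0.95):
    """Two-sided interval around a forecast, assuming normal forecast errors."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return forecast - z * std_error, forecast + z * std_error


def first_detection(actual, interval):
    """Return 'abnormal' when the actual value lies outside the interval."""
    lower, upper = interval
    return "normal" if lower <= actual <= upper else "abnormal"


interval = confidence_interval(forecast=120.0, std_error=10.0)
print(first_detection(125.0, interval))  # normal
print(first_detection(150.0, interval))  # abnormal
```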
Step 104, performing anomaly detection through a pre-trained SOM neural network model according to the spatial sequence data to obtain a second detection result for the behavior of the user.
Among them, the SOM (Self-Organizing Map) neural network is an unsupervised artificial neural network. The network structure of the SOM has two layers: an input layer and an output layer (also called the competition layer). Whereas a typical neural network is trained by backpropagation of a loss function, the SOM uses a competitive learning strategy and gradually optimizes the network through competition among neurons. The neurons are arranged on a two-dimensional grid of equally spaced nodes that forms the output layer; each node has a corresponding weight vector whose dimension equals that of the input data, and a neighborhood function is used to preserve the topology of the input space.
The SOM neural network model can be trained through the following steps S1 to S6:
S1, initializing the weight of each neuron in the preset SOM neural network.
Specifically, the preset SOM neural network is initialized, and the weight of each neuron may be initialized to a small random number greater than 0 and less than 1. In addition, the number of iterations, the learning rate and the neighborhood radius of the model also need to be initialized. For example, the number of iterations may be set to i = 1000, the initial learning rate to rate_max = 0.2 with rate_min = 0.05, and the initial neighborhood radius to zone_max = 1.5 with zone_min = 0.8. Each model parameter can be adjusted according to the data or requirements. A learning rate that is too small slows network optimization and lengthens training, while a learning rate that is too large makes the network parameters oscillate around the final optimum, so that the network fails to converge. In a specific implementation, the learning rate is set to 0.2 when training of the SOM neural network starts and then decreases quickly, so that the rough structure of the input vectors is captured quickly; once the learning rate has fallen to a small value, the neuron weights are adjusted to match the sample distribution of the input space. In addition, during training of the SOM neural network, a neighborhood radius R is set with the winning neuron as the center; R is initialized to the initial neighborhood radius, and the region within that radius is called the winning neighborhood. The winning neighborhood shrinks continuously as the number of training iterations increases, and the neighborhood radius finally shrinks to a fixed value.
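The shrinking learning rate and neighborhood radius can be illustrated with the example values above. The embodiment does not fix the shape of the decay, so the linear schedule below is only an assumption chosen for simplicity (a faster initial drop, as described, would call for an exponential or similar schedule); `linear_decay` is a hypothetical helper name.

```python
def linear_decay(start, end, step, total_steps):
    """Interpolate linearly from start (step 0) to end (last step)."""
    fraction = step / max(total_steps - 1, 1)
    return start + (end - start) * fraction


ITERATIONS = 1000
RATE_MAX, RATE_MIN = 0.2, 0.05   # example learning-rate bounds
ZONE_MAX, ZONE_MIN = 1.5, 0.8    # example neighborhood-radius bounds

for step in (0, 500, 999):
    rate = linear_decay(RATE_MAX, RATE_MIN, step, ITERATIONS)
    radius = linear_decay(ZONE_MAX, ZONE_MIN, step, ITERATIONS)
    print(step, round(rate, 3), round(radius, 3))
```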
S2, acquiring spatial sequence sample data associated with the behaviors of the sample users, and normalizing each piece of spatial sequence sample data to obtain a training sample set.
The process of acquiring the spatial sequence sample data may refer to the data acquisition process in step 101, and is not described herein again.
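The embodiment does not specify the normalization method. One common choice, shown here only as an assumed example, is per-feature min-max scaling into the range 0 to 1, which matches the range of the initial neuron weights; `min_max_normalize` is a hypothetical helper name.

```python
def min_max_normalize(rows):
    """Scale each column of a list of equal-length rows into [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        # A constant column carries no information and is mapped to 0.0.
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


samples = [[2.0, 10.0], [4.0, 30.0], [6.0, 20.0]]
print(min_max_normalize(samples))  # [[0.0, 0.0], [0.5, 1.0], [1.0, 0.5]]
```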
S3, randomly selecting a training sample from the training sample set and inputting it to the input layer of the SOM neural network to obtain an input vector.
S4, finding the winning neuron corresponding to the input vector according to the Euclidean distance between the input vector and each neuron in the competition layer of the SOM neural network.
Specifically, the Euclidean distance between the input vector X and each neuron is calculated, and the neuron with the smallest Euclidean distance to the input vector X is determined as the winning neuron. All neurons of the output layer of the SOM neural network compete with one another, and only one winning neuron can be activated at a time.
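The winner search can be sketched as follows. Storing the competition layer as a dictionary from grid coordinates to weight vectors is an assumption made for brevity, and `find_winner` is a hypothetical name.

```python
import math


def find_winner(x, weights):
    """Return the grid position of the neuron closest to x (Euclidean distance).

    `weights` maps (row, col) grid coordinates to weight vectors
    of the same length as x.
    """
    return min(weights, key=lambda pos: math.dist(x, weights[pos]))


weights = {
    (0, 0): [0.1, 0.1], (0, 1): [0.9, 0.1],
    (1, 0): [0.1, 0.9], (1, 1): [0.9, 0.9],
}
print(find_winner([0.8, 0.2], weights))  # (0, 1)
```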
S5, updating, by a gradient descent method, the weights of the winning neuron and of each neuron in the neuron set within its neighborhood range.
Specifically, a neighborhood radius is set with the winning neuron as the center, and the region within that radius is called the winning neighborhood. All neurons in the winning neighborhood are determined from the coordinates of the winning neuron and the neighborhood radius, and the weight of each neuron in the winning neighborhood is updated by the gradient descent method.
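As an illustration of this update, the sketch below moves every neuron whose grid position lies within the neighborhood radius of the winner toward the input vector. The update rule w ← w + rate · (x − w) is the commonly used SOM rule and is assumed here; it is a gradient descent step on half the squared distance between x and w. The names are hypothetical.

```python
import math


def update_neighborhood(x, weights, winner, radius, rate):
    """Move the winner and its grid neighbors toward x, in place.

    Each step w += rate * (x - w) is a gradient-descent step on
    0.5 * ||x - w||^2. Returns the grid positions that were updated.
    """
    updated = []
    for pos, w in weights.items():
        if math.dist(pos, winner) <= radius:  # distance measured on the neuron grid
            weights[pos] = [wi + rate * (xi - wi) for wi, xi in zip(w, x)]
            updated.append(pos)
    return updated


weights = {
    (0, 0): [0.0, 0.0], (0, 1): [1.0, 0.0],
    (1, 0): [0.0, 1.0], (1, 1): [1.0, 1.0],
}
touched = update_neighborhood([1.0, 0.5], weights, winner=(0, 1), radius=1.0, rate=0.2)
print(touched)          # [(0, 0), (0, 1), (1, 1)]
print(weights[(1, 0)])  # [0.0, 1.0], outside the neighborhood and unchanged
```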
S6, iteratively executing steps S3 to S5 until a preset end condition is reached and training ends, thereby obtaining the SOM neural network model and a plurality of clusters output by the SOM neural network model.
Specifically, new input samples are read from the training sample set and steps S3 to S5 are executed iteratively until all training samples have been trained; the learning rate and the neighborhood function are updated after the weights of all winning neurons have been updated. When the number of training iterations of the SOM neural network reaches the preset maximum, the training process exits, the trained SOM neural network model is obtained, and a plurality of clusters output by the SOM neural network model are acquired, wherein each cluster corresponds to a neighborhood range (namely a winning neighborhood) and the neighborhood range includes at least one neuron.
In this embodiment, the SOM neural network is used to mine the correlations among the influencing factors in the spatial sequence data, which facilitates the classification and study of abnormal user behaviors and provides strong generalization capability.
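Steps S1 to S6 can be put together in a compact sketch such as the following. It is an illustration under stated assumptions rather than the implementation of the embodiment: the linear decay of the learning rate and radius, the update rule w ← w + rate · (x − w), the fixed iteration count as the end condition, and all names are assumptions, and extraction of the output clusters is not shown.

```python
import math
import random


def train_som(samples, rows, cols, iterations=1000,
              rate_max=0.2, rate_min=0.05, zone_max=1.5, zone_min=0.8, seed=0):
    """Train a small SOM and return a dict mapping (row, col) to weight vectors."""
    rng = random.Random(seed)
    dim = len(samples[0])
    # S1: random initial weights drawn from [0, 1).
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    for step in range(iterations):  # S6: stop after a preset number of iterations
        frac = step / max(iterations - 1, 1)
        rate = rate_max + (rate_min - rate_max) * frac    # assumed linear decay
        radius = zone_max + (zone_min - zone_max) * frac  # assumed linear decay
        x = rng.choice(samples)                           # S3: random training sample
        # S4: winner is the neuron with the smallest Euclidean distance to x.
        winner = min(weights, key=lambda p: math.dist(x, weights[p]))
        # S5: update the winner and every neuron inside the winning neighborhood.
        for pos, w in weights.items():
            if math.dist(pos, winner) <= radius:
                weights[pos] = [wi + rate * (xi - wi) for wi, xi in zip(w, x)]
    return weights


# S2 is assumed done: the samples below are already normalized to [0, 1].
samples = [[0.1, 0.1], [0.15, 0.05], [0.9, 0.9], [0.85, 0.95]]
som = train_som(samples, rows=2, cols=2, iterations=200)
for pos in sorted(som):
    print(pos, [round(v, 2) for v in som[pos]])
```

Fixing the seed makes the run reproducible, which helps when comparing parameter settings.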
The implementation process of step 104 may include:
Step 1041, normalizing the spatial sequence data, and inputting the normalized spatial sequence data as input parameters into the SOM neural network model.
Step 1042, determining, according to the Euclidean distance from the input parameters to each neuron, the winning neuron corresponding to the input parameters and the cluster to which the winning neuron belongs.
Specifically, the Euclidean distance between the input vector X and each neuron is calculated, the neuron with the smallest Euclidean distance to the input vector X is determined as the winning neuron, and the neighborhood (that is, the cluster) to which the winning neuron belongs is determined.
Step 1043, calculating the clustering area of the cluster to which the winning neuron belongs, and comparing the clustering area with an area threshold, wherein the cluster to which the winning neuron belongs is an abnormal cluster only when the clustering area is smaller than the area threshold.
The area threshold can be set according to actual needs; a cluster with a small clustering area, that is, an isolated cluster of small scale, is treated as an abnormal cluster.
Specifically, the neighborhood radius of the winning neuron in the winning neighborhood is determined, the area of the circle having that neighborhood radius as its radius is calculated and taken as the clustering area of the cluster to which the winning neuron belongs, and the clustering area is compared with the area threshold.
Step 1044, generating a second detection result for the behavior of the user according to the result of the comparison.
When the clustering area of the cluster to which the winning neuron belongs is smaller than the area threshold, the second detection result indicates that the spatial sequence data of the user is abnormal data; otherwise, the second detection result indicates that the spatial sequence data of the user is normal data.
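The area comparison of steps 1043 and 1044 can be sketched as below, following the rule stated in step 1043 that a cluster is abnormal only when its clustering area is smaller than the area threshold. The function name and the threshold value 3.0 are hypothetical.

```python
import math


def second_detection(neighborhood_radius, area_threshold):
    """Flag the cluster as abnormal only when its circular area is below the threshold."""
    cluster_area = math.pi * neighborhood_radius ** 2
    return "abnormal" if cluster_area < area_threshold else "normal"


print(second_detection(0.8, area_threshold=3.0))  # area is about 2.01, so abnormal
print(second_detection(1.5, area_threshold=3.0))  # area is about 7.07, so normal
```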
It should be noted that, in the embodiment of the present invention, the order of executing step 102 and step 104 is not particularly limited, and it is preferable that the steps are executed simultaneously.
Step 105, performing anomaly identification on the behavior of the user according to the first detection result and the second detection result.
Specifically, anomaly identification may be performed on the behavior of the user as follows:
if the first detection result and the second detection result are both normal, the behavior of the user is determined to be normal; if the first detection result and the second detection result are both abnormal, the behavior of the user is determined to be abnormal; if only one of the first detection result and the second detection result is normal, the behavior of the user is determined to be suspected abnormal, and suspected abnormal behavior can be reviewed manually.
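The three-way decision can be written as a small function. The string labels and the name `identify` are assumptions made for the example.

```python
def identify(first_result, second_result):
    """Combine two detector outputs, each 'normal' or 'abnormal'."""
    allowed = {"normal", "abnormal"}
    if first_result not in allowed or second_result not in allowed:
        raise ValueError("results must be 'normal' or 'abnormal'")
    if first_result == second_result:
        return first_result
    return "suspicious"  # exactly one detector flagged the behavior


print(identify("normal", "normal"))      # normal
print(identify("abnormal", "abnormal"))  # abnormal
print(identify("normal", "abnormal"))    # suspicious
```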
Further, after step 105, the method may further comprise:
if the identification result of the user behavior indicates that the user behavior is abnormal, performing identity authentication on the user or restricting the operation behavior of the user.
Restricting the operation behavior includes disabling key functions on key pages of the application, the key functions including but not limited to viewing, entering and submitting.
In this embodiment, after the user is determined to be a risk user, the network security risk can be effectively controlled and prevented by performing identity authentication on the user or performing corresponding limiting operation on the user.
According to the method for identifying abnormal user behavior provided by the embodiment of the invention, the SOM neural network clustering algorithm is nonlinear and robust, has strong adaptive learning capability, and is well suited to handling uncertain or fuzzy information; it overcomes the K-means algorithm's dependence on a predetermined K value and its sensitivity to noisy data, so that the reliability and accuracy of identifying abnormal user behavior are improved. In addition, by combining the ARIMA model with the SOM neural network model, abnormal points in user behavior are mined in both the temporal and the spatial dimension; compared with a single traditional method, this improves the ability to identify abnormal points and the accuracy of identifying abnormal behavior.
Example two
An embodiment of the present invention provides a device for identifying an abnormal behavior of a user, and as shown in fig. 2, the device includes:
a data obtaining module 202, configured to obtain time series data and space series data associated with a behavior of a user;
the first detection module 204 is configured to predict an index confidence interval of the user at a preset time point through an ARIMA model according to a plurality of index actual values before the preset time point in the time series data, and compare the index actual value of the user at the preset time point with the corresponding index confidence interval to obtain a first detection result for a behavior of the user;
the second detection module 206 is configured to perform anomaly detection through a pre-trained SOM neural network model according to the spatial sequence data to obtain a second detection result for the behavior of the user;
and the anomaly identification module 208 is used for performing anomaly identification on the behavior of the user according to the first detection result and the second detection result.
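As a minimal sketch of how the two detection results might be produced and combined, the following Python assumes that the ARIMA forecast for the preset time point is already available as a mean and a standard error, builds a normal-approximation confidence interval from them, and flags the user when either detector reports an anomaly. The function names, the 95% level and the OR rule are illustrative assumptions; this embodiment does not fix them.

```python
from statistics import NormalDist


def confidence_interval(forecast_mean, forecast_se, level=0.95):
    """Two-sided normal-approximation interval around a forecast.

    forecast_mean and forecast_se stand in for the output of the ARIMA
    model; the 95% default level is an assumption.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return forecast_mean - z * forecast_se, forecast_mean + z * forecast_se


def first_detection(actual_value, interval):
    """True (anomalous) when the actual index value leaves the interval."""
    lower, upper = interval
    return not (lower <= actual_value <= upper)


def identify_anomaly(first_result, second_result):
    """Combine both detection results; the OR rule is an assumption."""
    return first_result or second_result


interval = confidence_interval(forecast_mean=100.0, forecast_se=10.0)
time_flag = first_detection(actual_value=150.0, interval=interval)
space_flag = False  # placeholder for the SOM-based second detection result
is_abnormal = identify_anomaly(time_flag, space_flag)
```

A stricter variant could require both detectors to agree (an AND rule), trading fewer false alarms for more missed anomalies.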
Further, the apparatus further comprises a construction module, the construction module specifically configured to:
acquiring time series sample data associated with the behavior of a sample user;
performing a stationarity test on the time series sample data, and performing differencing on the time series sample data that fails the test to obtain stationary time series sample data;
for the stationary time series sample data, establishing an initial ARIMA model, and determining the ranges of the autoregressive order and the moving average order of the initial ARIMA model according to the autocorrelation coefficients and the partial autocorrelation coefficients of the stationary time series sample data;
and determining the optimal combination of the autoregressive order and the moving average order of the initial ARIMA model by using the Akaike information criterion (AIC), thereby constructing the ARIMA model.
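For reference, the differencing and order selection steps can be written compactly. The notation below is the standard textbook form and is not taken from this embodiment: $B$ is the backshift operator, $d$ the number of differencing passes needed before the stationarity test is passed, $k$ the number of estimated parameters of a candidate model, $\hat{L}$ its maximized likelihood, and $P$ and $Q$ the candidate ranges of the autoregressive and moving average orders.

```latex
\nabla^{d} y_t = (1 - B)^{d} y_t, \qquad
\mathrm{AIC}(p, q) = 2k - 2\ln \hat{L}(p, q), \qquad
(p^{*}, q^{*}) = \arg\min_{p \in P,\; q \in Q} \mathrm{AIC}(p, q)
```

The penalty term $2k$ discourages choosing large orders that only marginally improve the fit.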
Further, the apparatus further comprises a training module, the training module comprising:
the initialization submodule is used for initializing the weight of each neuron in a preset SOM neural network;
the preprocessing submodule is used for acquiring spatial sequence sample data associated with the behaviors of the sample users and carrying out normalization processing on each spatial sequence sample data to obtain a training sample set;
the training submodule is used for randomly selecting a training sample from the training sample set and inputting it into the input layer of the SOM neural network to obtain an input vector, finding the winning neuron corresponding to the input vector according to the Euclidean distance between the input vector and each neuron in the competition layer of the SOM neural network, and updating the weights of the winning neuron and of each neuron within its neighborhood range by using a gradient descent method;
and the iteration submodule is used for repeatedly executing the step of the training submodule until a preset end condition is reached, at which point training ends, so as to obtain the SOM neural network model and a plurality of clusters output by the SOM neural network model.
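The training procedure carried out by these submodules can be sketched in plain Python as follows. This is an illustrative sketch only: the 3x3 grid, the linear decay of the learning rate and neighborhood radius, the Gaussian neighborhood function, the fixed iteration count used as the end condition, and the sample values are all assumptions, and the grouping of neurons into clusters is left out. The update w += rate * h * (x - w) is a gradient descent step on the squared distance between the weight and the input.

```python
import math
import random


def min_max_normalize(samples):
    """Scale every feature column to [0, 1]; constant columns map to 0."""
    columns = list(zip(*samples))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in samples
    ]


def winning_neuron(weights, vector):
    """Grid position of the neuron closest to the input (Euclidean distance)."""
    return min(weights, key=lambda pos: math.dist(weights[pos], vector))


def train_som(samples, rows=3, cols=3, iterations=500,
              learning_rate=0.5, radius=1.5, seed=0):
    rng = random.Random(seed)
    dim = len(samples[0])
    # Initialization: one random weight vector per competition-layer neuron.
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    for step in range(iterations):
        # Learning rate and radius shrink linearly (an assumption).
        decay = 1.0 - step / iterations
        rate = learning_rate * decay
        sigma = max(radius * decay, 1e-3)
        vector = rng.choice(samples)  # randomly selected training sample
        winner = winning_neuron(weights, vector)
        for pos, w in weights.items():
            grid_dist = math.dist(pos, winner)
            if grid_dist > sigma:
                continue  # outside the neighborhood of the winner
            influence = math.exp(-grid_dist ** 2 / (2 * sigma ** 2))
            # Move the weight toward the input: w += rate * h * (x - w).
            for i in range(dim):
                w[i] += rate * influence * (vector[i] - w[i])
    return weights


# Hypothetical two-feature samples, for illustration only.
raw = [[1, 200], [2, 210], [1.5, 190], [9, 20], [10, 25], [9.5, 30]]
training_set = min_max_normalize(raw)
som_weights = train_som(training_set)
```

Because the effective step size never exceeds 0.5, every weight remains a convex combination of its initial value and the normalized inputs, so the weights stay within the normalized range.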
Further, the second detection module 206 is specifically configured to:
normalizing the spatial sequence data, inputting the normalized spatial sequence data serving as input parameters into an SOM neural network model, and determining a winning neuron and a neighborhood to which the winning neuron belongs according to Euclidean distance from the input parameters to each neuron;
calculating the clustering area of the cluster to which the winning neuron belongs, and comparing the clustering area with an area threshold value, wherein the cluster to which the winning neuron belongs is an abnormal cluster only when the clustering area is smaller than the area threshold value;
and generating a second detection result aiming at the behavior of the user according to the comparison result.
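The detection steps above can be sketched as follows. The sketch assumes that the input has already been normalized and that the trained model supplies two lookup tables, one mapping each neuron to its cluster and one giving the clustering area of each cluster; how that area is measured is left open here, and all names and values are hypothetical.

```python
import math


def second_detection(vector, weights, cluster_of, cluster_area, area_threshold):
    """Return True when the input falls into an abnormal (small) cluster.

    weights:       neuron id -> weight vector of a trained SOM
    cluster_of:    neuron id -> cluster id (hypothetical lookup table)
    cluster_area:  cluster id -> clustering area (measure left open here)
    """
    # Winning neuron: smallest Euclidean distance to the normalized input.
    winner = min(weights, key=lambda n: math.dist(weights[n], vector))
    area = cluster_area[cluster_of[winner]]
    # Only a cluster whose area is below the threshold is abnormal.
    return area < area_threshold


# Hypothetical trained model with three neurons and two clusters.
weights = {"n0": [0.1, 0.9], "n1": [0.2, 0.8], "n2": [0.9, 0.1]}
cluster_of = {"n0": "A", "n1": "A", "n2": "B"}
cluster_area = {"A": 2.0, "B": 1.0}

normal = second_detection([0.1, 0.88], weights, cluster_of, cluster_area, 1.5)
abnormal = second_detection([0.95, 0.05], weights, cluster_of, cluster_area, 1.5)
```

In this toy setup the first input lands in the large cluster A and is treated as normal, while the second lands in the small cluster B and is flagged.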
Further, the apparatus further includes an exception handling module, which is specifically configured to:
and if the identification result of the user behavior indicates that the user behavior is abnormal, performing identity authentication on the user or limiting the operation behavior of the user.
The user abnormal behavior recognition device provided by the embodiment of the invention belongs to the same inventive concept as the user abnormal behavior recognition method provided by the embodiment of the invention, can execute that method, and has the functional modules and beneficial effects corresponding to executing that method. For technical details not described in detail in this embodiment, reference may be made to the user abnormal behavior identification method provided in the embodiment of the present invention; they are not repeated here.
Fig. 3 is an internal structural diagram of a computer device according to an embodiment of the present invention. The computer device may be a server, and its internal structure diagram may be as shown in fig. 3. The computer device includes a processor, a memory, and a network interface connected by a system bus. The processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program, when executed by the processor, implements a method for identifying abnormal user behavior.
Those skilled in the art will appreciate that the configuration shown in fig. 3 is a block diagram of only the portion of the configuration associated with aspects of the present invention and is not intended to limit the computing devices to which aspects of the present invention may be applied; a particular computing device may include more or fewer components than those shown, may combine certain components, or may have a different arrangement of components.
In one embodiment, a computer device is provided, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the following steps when executing the computer program:
acquiring time sequence data and space sequence data associated with the user behavior;
predicting an index confidence interval of the user at a preset time point through an ARIMA model according to a plurality of index actual values before the preset time point in the time sequence data;
comparing the actual index value of the user at a preset time point with the corresponding index confidence interval to obtain a first detection result aiming at the behavior of the user;
according to the spatial sequence data, carrying out anomaly detection through a pre-trained SOM neural network model to obtain a second detection result aiming at the user behavior;
and performing abnormal recognition on the user behavior according to the first detection result and the second detection result.
In one embodiment, there is also provided a computer readable storage medium having a computer program stored thereon, the computer program when executed by a processor implementing the steps of:
acquiring time sequence data and space sequence data associated with the user behavior;
predicting an index confidence interval of the user at a preset time point through an ARIMA model according to a plurality of index actual values before the preset time point in the time sequence data;
comparing the actual index value of the user at a preset time point with the corresponding index confidence interval to obtain a first detection result aiming at the behavior of the user;
according to the spatial sequence data, carrying out anomaly detection through a pre-trained SOM neural network model to obtain a second detection result aiming at the user behavior;
and performing abnormal recognition on the user behavior according to the first detection result and the second detection result.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, databases, or other media used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be combined arbitrarily. For the sake of brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features contains no contradiction, it should be considered to fall within the scope of the present specification.
The above examples show only some embodiments of the present invention. Although they are described specifically and in detail, they should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the inventive concept, all of which fall within the scope of the present invention. Therefore, the protection scope of the present patent shall be subject to the appended claims.