CN113949652B - User abnormal behavior detection method and device based on artificial intelligence and related equipment - Google Patents

User abnormal behavior detection method and device based on artificial intelligence and related equipment

Info

Publication number
CN113949652B
Authority
CN
China
Prior art keywords
user
abnormal
value
data
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111185035.3A
Other languages
Chinese (zh)
Other versions
CN113949652A (en
Inventor
王忠玉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Puhui Enterprise Management Co Ltd
Original Assignee
Ping An Puhui Enterprise Management Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Puhui Enterprise Management Co Ltd
Priority to CN202111185035.3A
Publication of CN113949652A
Application granted
Publication of CN113949652B
Current legal status: Active

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00: Arrangements for monitoring or testing data switching networks
    • H04L43/10: Active monitoring, e.g. heartbeat, ping or trace-route
    • H04L43/16: Threshold monitoring
    • H04L63/00: Network architectures or network communication protocols for network security
    • H04L63/02: Network architectures or network communication protocols for network security for separating internal from external traffic, e.g. firewalls
    • H04L63/0227: Filtering policies
    • H04L63/0245: Filtering by information in the payload
    • H04L63/10: Network architectures or network communication protocols for network security for controlling access to devices or network resources
    • H04L63/12: Applying verification of the received information
    • H04L63/123: Applying verification of the received information received data contents, e.g. message integrity

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Cardiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention relates to the technical field of artificial intelligence, and provides a method, a device and related equipment for detecting abnormal user behaviors based on artificial intelligence, wherein the method comprises the following steps: calling an interface of a target data source to obtain first data of each user; preprocessing the first data to obtain second data; extracting a first feature set and a second feature set from the second data; performing anomaly detection on each user by adopting a plurality of preset algorithms to obtain a first target anomaly value and a second target anomaly value; determining an abnormal behavior user based on the first target abnormal value and the second target abnormal value; and performing correlation analysis on the abnormal behavior user to obtain an abnormal behavior detection result of the abnormal behavior user. By adopting a plurality of preset algorithms, the invention performs anomaly detection on each user from a plurality of dimensions and performs correlation analysis on the abnormal behavior user from a plurality of dimensions, so that the accuracy and completeness of the abnormal behavior detection result are improved.

Description

User abnormal behavior detection method and device based on artificial intelligence and related equipment
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a method and a device for detecting abnormal user behaviors based on artificial intelligence and related equipment.
Background
User and entity behavior analysis systems are an emerging means of detecting abnormal users. They build profiles of users and entities and, based on various analysis methods (machine learning and the like), discover potential events in which activity deviates from the standard profile or behavior of a user or entity.
However, existing user and entity behavior analysis systems perform anomaly detection with a single algorithm. A single algorithm has inherent limitations, and because its output depends on the choice of algorithm parameters, it often fails to find the actual abnormal values, so that the accuracy of the user anomaly detection result is low.
Therefore, it is necessary to provide a method for quickly and accurately detecting abnormal user behavior.
Disclosure of Invention
In view of the above, it is necessary to provide a method, an apparatus, and related equipment for detecting abnormal user behavior based on artificial intelligence, which use a plurality of preset algorithms to perform anomaly detection on each user from a plurality of dimensions and perform correlation analysis on abnormal behavior users from a plurality of dimensions, thereby improving the accuracy and completeness of the abnormal behavior detection result.
The first aspect of the invention provides a method for detecting abnormal user behavior based on artificial intelligence, which comprises the following steps:
analyzing the received user abnormal behavior detection request to obtain a target data source;
calling an interface of the target data source, and acquiring first data of each user based on the interface;
preprocessing the first data of each user to obtain second data of each user;
extracting a first characteristic set and a second characteristic set of each user from the second data of each user according to a preset extraction rule;
performing anomaly detection on the first feature set of each user by adopting a plurality of preset algorithms to obtain a first target anomaly value of each user, and performing anomaly detection on the second feature set of each user by adopting a plurality of preset algorithms to obtain a second target anomaly value of each user;
determining abnormal behavior users based on the first target abnormal value and the second target abnormal value of each user;
and performing correlation analysis on the abnormal behavior user to obtain an abnormal behavior detection result of the abnormal behavior user.
Optionally, the invoking an interface of the target data source, and the obtaining first data of each user based on the interface includes:
identifying a service system corresponding to the target data source, and acquiring a plurality of preset interfaces from the service system;
consuming and filtering the data of the plurality of preset interfaces in real time through Kafka to obtain a plurality of pieces of filtered data;
applying regular-expression matching to the plurality of pieces of filtered data, and judging whether each piece of filtered data contains a sensitive field;
when a piece of filtered data contains a sensitive field, recording that piece of filtered data, wherein the piece may contain one or more sensitive fields;
identifying whether a first user ID exists in the request corresponding to each piece of filtered data containing a sensitive field;
when a first user ID exists in the request corresponding to a piece of filtered data containing a sensitive field, associating the pieces of filtered data containing a sensitive field that share the same first user ID, and determining the associated filtered data as the first data of each user; or
when no first user ID exists in the request corresponding to a piece of filtered data containing a sensitive field, acquiring the IP of that request, acquiring a second user ID of the user using that IP, associating the pieces of filtered data containing a sensitive field that share the same second user ID, and determining the associated filtered data as the first data of each user.
Optionally, the extracting the first feature set and the second feature set of each user from the second data of each user according to a preset extraction rule includes:
extracting third data in a preset time period from the second data of each user;
separating the third data according to the working day and the non-working day in the preset time period to obtain third data corresponding to the working day and third data corresponding to the non-working day;
and extracting the first feature set of each user from the third data corresponding to the working days according to a preset extraction rule, and summarizing and extracting the second feature set of each user from the third data corresponding to the non-working days according to a preset extraction rule.
Optionally, the performing, by using a plurality of preset algorithms, abnormality detection on the first feature set of each user to obtain a first target abnormal value of each user includes:
performing anomaly detection on the first feature set of each user by adopting a preset isolated forest algorithm to obtain a first anomaly value of each user;
performing anomaly detection on the first feature set of each user by adopting a preset difference integration moving average autoregressive algorithm to obtain a second anomaly value of each user;
respectively carrying out normalization processing on the first abnormal value and the second abnormal value to obtain a first probability and a second probability of each user;
calculating the product of the first probability of each user and a first weight value corresponding to the preset isolated forest algorithm to obtain a first product, and calculating the product of the second probability of each user and a second weight value corresponding to the preset difference integration moving average autoregressive algorithm to obtain a second product;
and calculating the sum of the first product and the second product to obtain a first target abnormal value of each user.
Optionally, the performing, by using the preset multiple algorithms, abnormality detection on the second feature set of each user to obtain a second target abnormal value of each user includes:
performing anomaly detection on the second feature set of each user by adopting the isolated forest algorithm to obtain a third anomaly value of each user;
performing anomaly detection on the second feature set of each user by adopting the difference integration moving average autoregressive algorithm to obtain a fourth anomaly value of each user;
respectively carrying out normalization processing on the third abnormal value and the fourth abnormal value to obtain a third probability and a fourth probability of each user;
calculating the product of the third probability of each user and a third weight value corresponding to the isolated forest algorithm to obtain a third product, and calculating the product of the fourth probability of each user and a fourth weight value corresponding to the difference integration moving average autoregressive algorithm to obtain a fourth product;
and calculating the sum of the third product and the fourth product to obtain a second target abnormal value of each user.
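Written compactly, the two weighted sums described above take the following form. The notation is not in the original and is introduced here only for illustration: for user $u$, $p_u^{(1)}$ to $p_u^{(4)}$ denote the first to fourth probabilities (the normalized abnormal values), $w_1$ to $w_4$ the corresponding weight values, and $T_u^{(1)}$, $T_u^{(2)}$ the first and second target abnormal values. The text does not state how the weights are chosen or whether they sum to 1.

```latex
\[
T_u^{(1)} = w_1\, p_u^{(1)} + w_2\, p_u^{(2)}, \qquad
T_u^{(2)} = w_3\, p_u^{(3)} + w_4\, p_u^{(4)}
\]
```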
Optionally, the determining the abnormal behavior user based on the first target abnormal value and the second target abnormal value of each user includes:
comparing the first target abnormal value of each user with a preset first target abnormal threshold value, and comparing the second target abnormal value of each user with a preset second target abnormal threshold value;
when the first target abnormal value of a user is greater than or equal to the preset first target abnormal threshold value, and/or the second target abnormal value of the user is greater than or equal to the preset second target abnormal threshold value, determining that the user is an abnormal behavior user; or
when the first target abnormal value of a user is smaller than the preset first target abnormal threshold value and the second target abnormal value of the user is smaller than the preset second target abnormal threshold value, determining that the user is a normal behavior user.
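The threshold comparison above reduces to a simple either-or test, which might be sketched as follows. The function name and the default threshold of 0.8 are assumptions made for the example; the original gives no threshold values.

```python
def classify_user(first_target, second_target,
                  first_threshold=0.8, second_threshold=0.8):
    """Label one user from the two target abnormal values.

    The 0.8 defaults are placeholders, not values taken from the patent.
    A user is abnormal as soon as either value reaches its threshold, and
    normal only when both values stay below their thresholds.
    """
    if first_target >= first_threshold or second_target >= second_threshold:
        return "abnormal"
    return "normal"
```

Because the "and/or" condition is satisfied whenever at least one comparison succeeds, a single `or` covers all three abnormal cases.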
Optionally, performing the correlation analysis on the abnormal behavior user includes one or more of the following:
identifying whether the account of the abnormal behavior user is a shared account; or
identifying whether the abnormal behavior user has VPN authority; or
identifying the number of preset rules hit by the abnormal behavior user; or
identifying whether the abnormal behavior user has left employment; or
identifying whether the abnormal behavior user has the authority to access sensitive information; or
identifying whether the account of the abnormal behavior user has been compromised.
A second aspect of the present invention provides an artificial intelligence-based user abnormal behavior detection apparatus, the apparatus comprising:
the analysis module is used for analyzing the received user abnormal behavior detection request to obtain a target data source;
the calling module is used for calling an interface of the target data source and acquiring first data of each user based on the interface;
the preprocessing module is used for preprocessing the first data of each user to obtain second data of each user;
the extraction module is used for extracting the first characteristic set and the second characteristic set of each user from the second data of each user according to a preset extraction rule;
the anomaly detection module is used for performing anomaly detection on the first feature set of each user by adopting a plurality of preset algorithms to obtain a first target anomaly value of each user and performing anomaly detection on the second feature set of each user by adopting a plurality of preset algorithms to obtain a second target anomaly value of each user;
the determining module is used for determining abnormal behavior users based on the first target abnormal value and the second target abnormal value of each user;
and the association analysis module is used for performing association analysis on the abnormal behavior user to obtain an abnormal behavior detection result of the abnormal behavior user.
A third aspect of the present invention provides an electronic device, comprising a processor and a memory, wherein the processor is configured to implement the artificial intelligence based user abnormal behavior detection method when executing a computer program stored in the memory.
A fourth aspect of the present invention provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the artificial intelligence based user abnormal behavior detection method.
In summary, in the method, the device and the related equipment for detecting abnormal user behavior based on artificial intelligence according to the invention, determining the target data source and acquiring the first data of each user through the interface of the target data source makes data acquisition more targeted and improves the accuracy of the acquired first data. The first data are preprocessed before the first feature set and the second feature set of each user are extracted; preprocessing unifies the data format and ensures that the data used for detecting abnormal user behavior are consistent in format. Anomaly detection is performed on the first feature set and the second feature set separately by adopting a plurality of preset algorithms, so that detection covers a plurality of dimensions and the accuracy of the target abnormal values of each user is ensured. Abnormal behavior users are determined according to the detected target abnormal values, and correlation analysis is performed on the abnormal behavior users from a plurality of dimensions, so that a more accurate and complete abnormal behavior detection result is obtained.
Drawings
Fig. 1 is a flowchart of a method for detecting abnormal user behavior based on artificial intelligence according to an embodiment of the present invention.
Fig. 2 is a structural diagram of an apparatus for detecting abnormal user behavior based on artificial intelligence according to a second embodiment of the present invention.
Fig. 3 is a schematic structural diagram of an electronic device according to a third embodiment of the present invention.
Detailed Description
In order that the above objects, features and advantages of the present invention can be more clearly understood, a detailed description of the present invention will be given below with reference to the accompanying drawings and specific embodiments. It should be noted that the embodiments of the present invention and features of the embodiments may be combined with each other without conflict.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention.
Example one
Fig. 1 is a flowchart of a method for detecting abnormal user behavior based on artificial intelligence according to an embodiment of the present invention.
In this embodiment, the method for detecting abnormal user behavior based on artificial intelligence may be applied to an electronic device. For an electronic device that needs to perform abnormal user behavior detection based on artificial intelligence, the detection function provided by the method of the present invention may be integrated directly on the electronic device, or may run on the electronic device in the form of a Software Development Kit (SDK).
The embodiment of the invention can acquire and process related data based on artificial intelligence technology. Artificial intelligence (AI) is the theory, method, technology and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain the best result.
The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a robot technology, a biological recognition technology, a voice processing technology, a natural language processing technology, machine learning, deep learning and the like.
As shown in fig. 1, the method for detecting abnormal user behavior based on artificial intelligence specifically includes the following steps, and the order of the steps in the flowchart may be changed, and some steps may be omitted according to different requirements.
In this embodiment, with the rapid development of the digital era, enterprise information and network security face more and more challenges. In enterprise information security and management, malicious or careless employees, compromised accounts, and compromised hosts may all cause enterprise information leakage. To improve the security of enterprise information, the footprint data of each user in the enterprise system needs to be monitored in real time. Anomaly detection is performed on the footprint data of each user by adopting a plurality of preset algorithms, abnormal behavior users are determined according to the detected target abnormal values, and correlation analysis is performed on the abnormal behavior users from a plurality of dimensions, so that a more accurate and complete abnormal behavior detection result is obtained and the abnormal users are determined.
S11, analyzing the received user abnormal behavior detection request to acquire a target data source.
In this embodiment, to detect abnormal user behavior, a user initiates a user abnormal behavior detection request to a server through a client. Specifically, the client may be a smartphone, an iPad, or another existing smart device, and the server may be a user abnormal behavior detection subsystem. During detection, the client sends the user abnormal behavior detection request to the user abnormal behavior detection subsystem, which receives the request.
In this embodiment, when the server receives the user abnormal behavior detection request, the server parses the request to obtain a target data source. Specifically, the target data source represents the service scenario to be examined; that is, detection targets internal users who abnormally access the service system corresponding to that service scenario.
S12, calling an interface of the target data source, and acquiring first data of each user based on the interface.
In this embodiment, one target data source has multiple interfaces, the interfaces are used to record access footprints, and first data of a user accessing the interfaces can be acquired through the interfaces.
In an optional embodiment, the invoking the interface of the target data source, and the acquiring the first data of each user based on the interface includes:
identifying a service system corresponding to the target data source, and acquiring a plurality of preset interfaces from the service system;
consuming and filtering the data of the plurality of preset interfaces in real time through Kafka to obtain a plurality of pieces of filtered data;
applying regular-expression matching to the plurality of pieces of filtered data, and judging whether each piece of filtered data contains a sensitive field;
when a piece of filtered data contains a sensitive field, recording that piece of filtered data, wherein the piece may contain one or more sensitive fields;
identifying whether a first user ID exists in the request corresponding to each piece of filtered data containing a sensitive field;
when a first user ID exists in the request corresponding to a piece of filtered data containing a sensitive field, associating the pieces of filtered data containing a sensitive field that share the same first user ID, and determining the associated filtered data as the first data of each user; or
when no first user ID exists in the request corresponding to a piece of filtered data containing a sensitive field, acquiring the IP of that request, acquiring a second user ID of the user using that IP, associating the pieces of filtered data containing a sensitive field that share the same second user ID, and determining the associated filtered data as the first data of each user.
In this embodiment, the preset interfaces may be configured in advance. Specifically, a preset interface is an interface that involves sensitive fields. A service system may, for example, include 700 interfaces involving sensitive fields, and these interfaces are generally stored in a list.
In this embodiment, the data of the plurality of preset interfaces are consumed and filtered in real time through Kafka, and regular-expression matching is used to judge whether each piece of filtered data contains a sensitive field, for example a mobile phone number or an order number. The filtered data containing sensitive information are recorded.
In this embodiment, since the user ID can be obtained from the recorded filtered data containing sensitive information, recording that data makes it possible to determine the operating user of each record, which facilitates subsequent detection of abnormal user behavior.
In this embodiment, the rule for obtaining the user ID is as follows. If a first user ID exists in the request corresponding to a piece of filtered data containing a sensitive field, the IP of the request does not need to be acquired. If no first user ID exists, a second user ID is acquired according to the IP of that request. This avoids the users that would be missed if records without a first user ID were simply deleted, and improves the completeness of the acquired users.
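The matching and attribution rule described above might be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the record layout (`user_id`, `ip`, `body`), the two regular expressions, the `ORD` order-number format, and the `ip_to_user` lookup table are all hypothetical and introduced only for the example. Kafka consumption is left out; the function takes already-filtered records as a plain list.

```python
import re
from collections import defaultdict

# Hypothetical patterns: an 11-digit mobile number starting with 1, and an
# order number of the assumed form "ORD" followed by 10 digits.
SENSITIVE_PATTERNS = {
    "mobile": re.compile(r"(?<!\d)1[3-9]\d{9}(?!\d)"),
    "order_no": re.compile(r"\bORD\d{10}\b"),
}


def find_sensitive_fields(body):
    """Return the names of all sensitive fields whose pattern matches body."""
    return [name for name, pat in SENSITIVE_PATTERNS.items() if pat.search(body)]


def group_first_data(records, ip_to_user):
    """Group records containing sensitive fields by the requesting user.

    The user ID carried by the request is used when present; otherwise the
    request IP is looked up in ip_to_user (an assumed IP-to-user mapping).
    """
    first_data = defaultdict(list)
    for rec in records:
        fields = find_sensitive_fields(rec["body"])
        if not fields:
            continue  # no sensitive field, so the record is not kept
        user = rec.get("user_id") or ip_to_user.get(rec.get("ip"))
        if user is None:
            continue  # neither rule resolves a user for this record
        first_data[user].append({**rec, "sensitive_fields": fields})
    return dict(first_data)
```

The IP fallback is what keeps records without a user ID from being dropped, which is the completeness argument made in the text.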
In other embodiments, if logs of multiple service systems are mixed together, because the interface patterns of different systems are different, the data of each service system can be distinguished through the interface patterns, so that the accuracy of the acquired first data is improved.
S13, preprocessing the first data of each user to obtain second data of each user.
In this embodiment, because the first data may contain missing data, redundant data, or data in inconsistent formats, the first data is preprocessed after it is obtained. Specifically, the preprocessing includes data cleaning and data format conversion on the first data.
In an optional embodiment, the preprocessing the first data of each user to obtain the second data of each user includes:
and carrying out data cleaning on the first data of each user, and carrying out format conversion on the cleaned first data according to a preset format conversion rule to obtain second data of each user.
Specifically, the data cleaning comprises one or a combination of the following: missing-value cleaning, format and content cleaning, logic-error cleaning, and removal of unneeded data.
In this embodiment, because the data formats acquired by the plurality of preset interfaces are different, the format conversion is performed on the cleaned first data by using the preset format conversion rule, the format of the second data is unified, the consistency of the data format subsequently used for detecting the abnormal behavior of the user is ensured, and the efficiency and accuracy of detecting the abnormal behavior of the user are further improved.
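A minimal sketch of such cleaning and format conversion is given below. The record fields (`user_id`, `interface`, `time`), the list of accepted timestamp formats, and the unified output layout are assumptions made for the example; the original does not specify the preset format conversion rule.

```python
from datetime import datetime

# Assumed timestamp formats emitted by different interfaces.
TIME_FORMATS = ("%Y-%m-%d %H:%M:%S", "%Y/%m/%d %H:%M:%S", "%Y-%m-%dT%H:%M:%S")
REQUIRED = ("user_id", "interface", "time")


def parse_time(raw):
    """Try each assumed format in turn; return None if none of them fits."""
    for fmt in TIME_FORMATS:
        try:
            return datetime.strptime(raw, fmt)
        except (ValueError, TypeError):
            continue
    return None


def preprocess(first_data):
    """Clean raw records and convert them to one unified format."""
    seen = set()
    second_data = []
    for rec in first_data:
        # Missing-value cleaning: drop records lacking a required field.
        if any(not rec.get(key) for key in REQUIRED):
            continue
        # Format cleaning: drop records whose timestamp cannot be parsed.
        ts = parse_time(rec["time"])
        if ts is None:
            continue
        unified = (str(rec["user_id"]).strip(),
                   rec["interface"].strip().lower(), ts)
        # Redundant-data cleaning: drop exact duplicates after unification.
        if unified in seen:
            continue
        seen.add(unified)
        second_data.append(
            {"user_id": unified[0], "interface": unified[1], "time": ts})
    return second_data
```

Deduplication is done after conversion so that the same access reported in two different formats is counted once.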
S14, extracting the first feature set and the second feature set of each user from the second data of each user according to a preset extraction rule.
In this embodiment, an extraction rule may be preset. Specifically, the preset extraction rule is set according to the anomaly detection requirement in the user abnormal behavior detection request. For example, the preset extraction rule may be to extract the data of the sensitive-system interfaces accessed by a user within 90 days, count the number of accesses per hour, and establish a 90-day access baseline.
In an optional embodiment, the extracting the first feature set and the second feature set of each user from the second data of each user according to a preset extraction rule includes:
extracting third data in a preset time period from the second data of each user;
separating the third data according to the working days and the non-working days in the preset time period to obtain third data corresponding to the working days and third data corresponding to the non-working days;
and extracting the first feature set of each user from the third data corresponding to the working days according to a preset extraction rule, and summarizing and extracting the second feature set of each user from the third data corresponding to the non-working days according to a preset extraction rule.
In this embodiment, in data extraction, differences between visits of working days and non-working days are considered, so that data of the working days and the non-working days are separated during statistics to obtain a first feature set of each user in the working days and a second feature set of each user in the non-working days.
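The separation of working-day and non-working-day data might be sketched as follows. The feature layout is an assumption: each feature set is taken here to be 24 access counts, one per hour of day, accumulated over the window. Weekends are treated as non-working days, and the optional `holidays` set is a hypothetical hook for a public-holiday calendar, which the original does not describe.

```python
from collections import Counter
from datetime import datetime, timedelta


def hourly_feature_sets(timestamps, end, days=90, holidays=frozenset()):
    """Split one user's accesses into workday and non-workday hourly counts.

    timestamps: datetimes of the user's accesses to sensitive interfaces.
    end: end of the window; only accesses in (end - days, end] are used.
    holidays: assumed set of dates counted as non-working days besides weekends.
    Returns (first_feature_set, second_feature_set), each a list of 24 counts.
    """
    start = end - timedelta(days=days)
    workday, non_workday = Counter(), Counter()
    for ts in timestamps:
        if not (start < ts <= end):
            continue  # outside the preset time period
        is_off = ts.weekday() >= 5 or ts.date() in holidays
        (non_workday if is_off else workday)[ts.hour] += 1
    first = [workday[h] for h in range(24)]
    second = [non_workday[h] for h in range(24)]
    return first, second
```

Keeping the two sets apart means that a burst of weekend activity is judged against weekend behavior rather than being diluted by the much larger working-day volume.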
S15, carrying out anomaly detection on the first feature set of each user by adopting a plurality of preset algorithms to obtain a first target anomaly value of each user, and carrying out anomaly detection on the second feature set of each user by adopting a plurality of preset algorithms to obtain a second target anomaly value of each user.
In this embodiment, the feature values of each user comprise the first feature set and the second feature set. A plurality of algorithms may be preset, for example an isolated forest (isolation forest) algorithm and a difference integration moving average autoregressive (ARIMA) algorithm.
In this embodiment, the target abnormal value of each user includes a first target abnormal value and a second target abnormal value, where the first target abnormal value refers to an abnormal score that occurs when each user accesses the service system on a working day, and the second target abnormal value refers to an abnormal score that occurs when each user accesses the service system on a non-working day.
In an optional embodiment, the performing, by using a plurality of preset algorithms, abnormality detection on the first feature set of each user to obtain a first target abnormal value of each user includes:
performing anomaly detection on the first feature set of each user by adopting a preset isolated forest algorithm to obtain a first anomaly value of each user;
performing anomaly detection on the first feature set of each user by adopting a preset difference integration moving average autoregressive algorithm to obtain a second anomaly value of each user;
respectively carrying out normalization processing on the first abnormal value and the second abnormal value to obtain a first probability and a second probability of each user;
calculating the product of the first probability of each user and a first weight value corresponding to the preset isolated forest algorithm to obtain a first product, and calculating the product of the second probability of each user and a second weight value corresponding to the preset difference integration moving average autoregressive algorithm to obtain a second product;
and calculating the sum of the first product and the second product to obtain a first target abnormal value of each user.
In the embodiment, the abnormal value of each user is weighted and calculated by combining an isolated forest algorithm and a difference integration moving average autoregressive algorithm, so that the first target abnormal value of each user is obtained.
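The weighted combination described above can be sketched as follows. The text only states that the abnormal values are normalized into probabilities and combined with per-algorithm weights; the use of min-max normalization, the function names and the example weights 0.6 and 0.4 are assumptions made for illustration.

```python
def min_max_normalize(scores):
    """Map raw per-user scores to [0, 1] (assumed normalization scheme)."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {user: 0.0 for user in scores}
    return {user: (s - lo) / (hi - lo) for user, s in scores.items()}

def fuse_scores(first_values, second_values, first_weight, second_weight):
    """Weighted sum of the two normalized abnormal values for each user."""
    p1 = min_max_normalize(first_values)   # first probability (isolated forest)
    p2 = min_max_normalize(second_values)  # second probability (ARIMA)
    return {u: first_weight * p1[u] + second_weight * p2[u] for u in p1}

# Hypothetical raw abnormal values for three users.
forest_values = {"u1": 0.42, "u2": 0.55, "u3": 0.80}
arima_values = {"u1": 3.0, "u2": 1.0, "u3": 9.0}
first_target = fuse_scores(forest_values, arima_values, 0.6, 0.4)
```

In this example, `u3` has the highest value under both algorithms and therefore receives the largest first target abnormal value.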
Specifically, the preset isolated forest algorithm is based on the principle that abnormal samples can be separated from normal samples with fewer random splits. The training process of the isolated forest algorithm comprises the following steps: 1) randomly selecting y points from the acquired first feature sets of the plurality of users as a subsample, and placing the subsample in the root node of a preset isolated tree; 2) randomly selecting a dimension, and randomly generating a cutting point p within the range of the current node's data, that is, between the minimum value and the maximum value of the selected dimension in the current node's data; 3) generating a hyperplane from the cutting point p to divide the data space of the current node into two subspaces, placing the points whose value in the selected dimension is smaller than p on the left branch of the current node, and placing the points whose value is greater than or equal to p on the right branch of the current node; 4) recursively repeating steps 2) and 3) on the left and right branch nodes of the current node, constructing new nodes until a leaf node holds only one piece of data and cannot be cut further, or the tree reaches a preset height; 5) combining the isolated trees, and calculating the first abnormal value of each user.
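The training steps above can be sketched in plain Python as follows. This is an illustrative sketch only: the text does not specify how the isolated trees are combined into the first abnormal value, so the scoring used here (the score 2^(-E[h]/c(n)) from the original isolation forest literature) is an assumption, as are the function names, the two hypothetical features per user and the default parameters.

```python
import math
import random

EULER_GAMMA = 0.5772156649

def c_factor(n):
    """Expected path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n

def build_tree(points, height, max_height, rng):
    """Steps 2) to 4): split recursively on a random dimension and cutting point."""
    if len(points) <= 1 or height >= max_height:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:  # all values equal in this dimension, so no cut is possible here
        return {"size": len(points)}
    cut = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < cut]
    right = [p for p in points if p[dim] >= cut]
    return {"dim": dim, "cut": cut,
            "left": build_tree(left, height + 1, max_height, rng),
            "right": build_tree(right, height + 1, max_height, rng)}

def build_forest(points, n_trees=100, sample_size=8, seed=0):
    """Step 1) for each tree: draw a random subsample and grow one isolated tree."""
    if len(points) < 2:
        raise ValueError("need at least two points")
    rng = random.Random(seed)
    size = max(2, min(sample_size, len(points)))
    max_height = max(1, math.ceil(math.log2(size)))
    trees = [build_tree(rng.sample(points, size), 0, max_height, rng)
             for _ in range(n_trees)]
    return trees, size

def path_length(point, node, depth=0):
    """Depth at which the point lands, adjusted for unsplit leaves."""
    if "cut" not in node:
        return depth + c_factor(node["size"])
    branch = "left" if point[node["dim"]] < node["cut"] else "right"
    return path_length(point, node[branch], depth + 1)

def anomaly_score(point, trees, sample_size):
    """Step 5) (assumed scoring): shorter average paths give scores closer to 1."""
    mean_depth = sum(path_length(point, t) for t in trees) / len(trees)
    return 2.0 ** (-mean_depth / c_factor(sample_size))

# Hypothetical features: (accesses per hour, distinct sensitive interfaces).
normal_users = [(10, 2), (11, 3), (9, 2), (12, 3), (10, 4), (11, 2), (9, 3), (12, 4)]
outlier = (200, 40)
trees, size = build_forest(normal_users + [outlier])
outlier_score = anomaly_score(outlier, trees, size)
normal_score = anomaly_score((11, 3), trees, size)
```

A point far from the cluster is usually isolated after one or two cuts, so its score is expected to be higher than that of a typical point.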
Specifically, the flow of the difference-integrated moving average autoregressive (ARIMA) algorithm is as follows: 1) acquiring time series data, namely the number of user accesses to the service system per hour; 2) checking whether the time series data is stationary, and if it is not, applying d-th order differencing to obtain stationary time series data; 3) determining the optimal orders p and q by analyzing the autocorrelation coefficients and the partial autocorrelation coefficients; 4) after the three parameters p, d and q are obtained, training and predicting with the difference-integrated moving average autoregressive model (p, d, q), and calculating the second abnormal value of each user from the predicted value and the actual value.
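Parts of this flow can be sketched with the standard library alone. The sketch below covers only d-th order differencing, the sample autocorrelation coefficient used when choosing the orders, and a residual-based abnormal value. Fitting the (p, d, q) model itself and computing partial autocorrelations are omitted and would normally be delegated to a time series library. Using the absolute difference between the actual and predicted values as the second abnormal value is an assumption, since the text only says the value is calculated from the two.

```python
def difference(series, d=1):
    """Apply d-th order differencing to a list of numbers."""
    for _ in range(d):
        series = [b - a for a, b in zip(series, series[1:])]
    return series

def autocorrelation(series, lag):
    """Sample autocorrelation coefficient of the series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    if var == 0:
        return 0.0
    cov = sum((series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag))
    return cov / var

def residual_score(actual, predicted):
    """Assumed second abnormal value: absolute prediction error."""
    return abs(actual - predicted)

# Hypothetical hourly access counts with a linear trend.
hourly = [5, 7, 9, 11, 13, 15]
first_diff = difference(hourly, 1)   # the trend is removed after one differencing
second_diff = difference(hourly, 2)
lag1 = autocorrelation([1, 2, 1, 2, 1, 2], 1)
score = residual_score(actual=30, predicted=12)
```

Here one round of differencing already yields a constant series, which suggests d = 1 for this toy input.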
In this embodiment, the isolated forest algorithm and the difference-integrated moving average autoregressive algorithm are both prior art and are not described in further detail herein.
In an optional embodiment, the performing, by using the preset multiple algorithms, abnormality detection on the second feature set of each user to obtain a second target abnormal value of each user includes:
performing anomaly detection on the second feature set of each user by adopting the isolated forest algorithm to obtain a third anomaly value of each user;
performing anomaly detection on the second feature set of each user by adopting the differential integration moving average autoregressive algorithm to obtain a fourth anomaly value of each user;
respectively carrying out normalization processing on the third abnormal value and the fourth abnormal value to obtain a third probability and a fourth probability of each user;
calculating the product of the third probability of each user and a third weight value corresponding to the isolated forest algorithm to obtain a third product, and calculating the product of the fourth probability of each user and a fourth weight value corresponding to the difference integration moving average autoregressive algorithm to obtain a fourth product;
and calculating the sum of the third product and the fourth product to obtain a second target abnormal value of each user.
In this embodiment, the first abnormal value and the third abnormal value calculated by the preset isolated forest algorithm, and the second abnormal value and the fourth abnormal value calculated by the preset difference-integrated moving average autoregressive algorithm, may be on different scales. The first, second, third and fourth abnormal values are therefore normalized to the [0,1] interval so that they share the same scale, which ensures consistency among the abnormal values. In addition, the weight of each preset algorithm is taken into account when the first target abnormal value and the second target abnormal value of each user are calculated, which improves the accuracy and reasonableness of both target abnormal values.
In other optional embodiments, because user access to the system differs greatly between working days and non-working days, a first weight value is preset for the isolated forest algorithm and a second weight value is preset for the difference-integrated moving average autoregressive algorithm for working days; a third weight value is preset for the isolated forest algorithm and a fourth weight value is preset for the difference-integrated moving average autoregressive algorithm for non-working days, wherein the first weight value is different from the third weight value, and the second weight value is different from the fourth weight value.
In this embodiment, a plurality of preset algorithms are used to perform anomaly detection on the first feature set and the second feature set of each user separately, to obtain the first target abnormal value and the second target abnormal value of each user. Because the access volumes of working days and non-working days differ, the feature sets of working days and non-working days are examined separately, which ensures the accuracy of the algorithms, avoids a large number of false alarms in subsequent processing, and improves maintenance efficiency.
In this embodiment, a plurality of preset algorithms are used to perform anomaly detection on the feature set of each user, and the feature set of each user is subjected to anomaly detection from multiple dimensions, so as to ensure the accuracy of the obtained target anomaly value of each user.
S16, determining abnormal behavior users based on the first target abnormal value and the second target abnormal value of each user.
In this embodiment, there may be one or more abnormal behavior users. Specifically, an abnormal behavior user is a user whose target abnormal value reaches or exceeds a preset target abnormal threshold. Different thresholds are set for working days and non-working days: the preset first target abnormal threshold is the threshold preset for working days, and the preset second target abnormal threshold is the threshold preset for non-working days.
In an optional embodiment, the determining the abnormally-behaving user based on the first target abnormal value and the second target abnormal value of each user comprises:
comparing the first target abnormal value of each user with a preset first target abnormal threshold value, and comparing the second target abnormal value of each user with a preset second target abnormal threshold value;
when the first target abnormal value of a user is greater than or equal to the preset first target abnormal threshold value, and/or the second target abnormal value of the user is greater than or equal to the preset second target abnormal threshold value, determining that the user is an abnormal behavior user; or
when the first target abnormal value of a user is smaller than the preset first target abnormal threshold value and the second target abnormal value of the user is smaller than the preset second target abnormal threshold value, determining that the user is a normal behavior user.
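The comparison rule above amounts to a simple decision function, sketched below. The function name, the label strings and the threshold values are hypothetical.

```python
def classify_user(first_target, second_target, first_threshold, second_threshold):
    """Return 'abnormal' if either target abnormal value reaches its threshold."""
    if first_target >= first_threshold or second_target >= second_threshold:
        return "abnormal"
    return "normal"

# Hypothetical thresholds: 0.8 for working days, 0.7 for non-working days.
working_day_hit = classify_user(0.9, 0.1, 0.8, 0.7)
non_working_day_hit = classify_user(0.1, 0.7, 0.8, 0.7)
no_hit = classify_user(0.1, 0.1, 0.8, 0.7)
```

A user is flagged as soon as one of the two values reaches its own threshold, and is treated as normal only when both stay below.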
In this embodiment, because the access volumes of working days and non-working days differ greatly, the target abnormal values also differ greatly. Different target abnormal thresholds are therefore preset for working days and non-working days, which ensures that abnormal behavior users are identified accurately and improves the efficiency of the subsequent abnormal behavior detection for those users.
S17, performing association analysis on the abnormal behavior user to obtain an abnormal behavior detection result of the abnormal behavior user.
In this embodiment, the association analysis refers to detecting abnormal behaviors of the abnormal behavior user on data sources other than the target data source.
In an optional embodiment, the performing the association analysis on the abnormal behavior user includes one or more of the following ways:
identifying whether the account of the abnormal behavior user is a shared account; or
identifying whether the abnormal behavior user has VPN authority; or
identifying the number of preset rules hit by the abnormal behavior user; or
identifying whether the abnormal behavior user has resigned; or
identifying whether the abnormal behavior user has the authority to access sensitive information; or
identifying whether the account of the abnormal behavior user has been disabled.
In this embodiment, the analysis covers, for example: whether the account of the abnormal behavior user is a shared account; whether the abnormal behavior user has printed a large number of sensitive files; whether a preset rule has been triggered, that is, whether other monitoring rules have been triggered; and whether the account of the abnormal behavior user has been disabled, that is, whether an operation such as an account ban has been performed on the account.
In this embodiment, association analysis is performed on the abnormal behavior user from multiple dimensions across other data sources, which makes the abnormal behavior detection more accurate and yields a more complete abnormal behavior detection result.
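One possible way to gather these association checks into a single detection result is sketched below. Everything in it is hypothetical: the `context` dictionary stands in for lookups against the other data sources, and the key names are invented for illustration.

```python
def association_analysis(user, context):
    """Collect the association checks for one abnormal behavior user."""
    return {
        "shared_account": user in context["shared_accounts"],
        "has_vpn": user in context["vpn_users"],
        "rules_hit": len(context["rule_hits"].get(user, [])),
        "resigned": user in context["resigned_users"],
        "sensitive_access": user in context["sensitive_access_users"],
        "account_disabled": user in context["disabled_accounts"],
    }

# Hypothetical snapshots of the other data sources.
context = {
    "shared_accounts": {"u7"},
    "vpn_users": {"u3", "u7"},
    "rule_hits": {"u3": ["bulk_export", "night_login"]},
    "resigned_users": set(),
    "sensitive_access_users": {"u3"},
    "disabled_accounts": set(),
}
result = association_analysis("u3", context)
```

The resulting dictionary can be attached to the abnormal behavior user as the detection result for later review.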
In other optional embodiments, a plurality of abnormal behavior users with larger target abnormal values may be obtained, together with the dimensions of these users. Abnormal behavior users with larger target abnormal values tend to have more permissions and to hit more preset rules, which better reflects the advantages of the algorithms. Each preset algorithm is retrained with the newly added abnormal behavior users having larger target abnormal values, and the parameters and weight values of the algorithms are adjusted continuously during training, so that each preset algorithm is progressively refined and its accuracy is ensured.
In summary, in the artificial intelligence based user abnormal behavior detection method of this embodiment, a target data source is determined and the first data of each user is obtained through an interface of the target data source, which makes the acquisition more targeted and improves the accuracy of the obtained first data. The first data is preprocessed to unify the data format before the first feature set and the second feature set of each user are extracted, which ensures the consistency of the data format used for detecting abnormal user behavior. The first feature set and the second feature set are each subjected to anomaly detection with a plurality of preset algorithms, so that anomalies are detected from multiple dimensions and the accuracy of the target abnormal value of each user is ensured. Abnormal behavior users are determined from the detected target abnormal values and are then subjected to association analysis from multiple dimensions, which yields a more accurate and complete abnormal behavior detection result.
Embodiment two
Fig. 2 is a structural diagram of an apparatus for detecting abnormal user behavior based on artificial intelligence according to a second embodiment of the present invention.
In some embodiments, the artificial intelligence based user abnormal behavior detection apparatus 20 may include a plurality of functional modules composed of program code segments. The program code of each segment in the apparatus 20 may be stored in a memory of the electronic device and executed by at least one processor to perform the function of artificial intelligence based user abnormal behavior detection (see the description of fig. 1 for details).
In this embodiment, the artificial intelligence based user abnormal behavior detection apparatus 20 may be divided into a plurality of functional modules according to the functions it performs. The functional modules may include: a parsing module 201, a calling module 202, a preprocessing module 203, an extracting module 204, an anomaly detection module 205, a determining module 206 and an association analysis module 207. A module referred to herein is a series of computer readable instruction segments that are stored in a memory, can be executed by at least one processor, and perform a fixed function. The functions of the modules are described in detail below.
The parsing module 201 is configured to parse the received user abnormal behavior detection request to obtain a target data source.
In this embodiment, when abnormal user behavior is to be detected, a user initiates a user abnormal behavior detection request to a server through a client. Specifically, the client may be a smartphone, an iPad, or another existing smart device, and the server may be a user abnormal behavior detection subsystem. During user abnormal behavior detection, the client sends the user abnormal behavior detection request to the user abnormal behavior detection subsystem, and the subsystem receives the request sent by the client.
In this embodiment, when the server receives the user abnormal behavior detection request, the server parses the abnormal behavior detection request to obtain a target data source, and specifically, the target data source is used to represent a detected service scenario, that is, to detect an abnormal internal user accessing a service system corresponding to the service scenario.
The calling module 202 is configured to call an interface of the target data source, and obtain first data of each user based on the interface.
In this embodiment, one target data source has multiple interfaces, the interfaces are used to record access footprints, and first data of a user accessing the interfaces can be acquired through the interfaces.
In an optional embodiment, the calling module 202 calling an interface of the target data source and acquiring the first data of each user based on the interface includes:
identifying a service system corresponding to the target data source, and acquiring a plurality of preset interfaces from the service system;
consuming and filtering the data of the plurality of preset interfaces in real time through Kafka to obtain a plurality of pieces of filtered data;
performing regular-expression matching on the plurality of pieces of filtered data to judge whether each piece of filtered data contains a sensitive field;
when a piece of filtered data contains a sensitive field, recording that piece of filtered data, wherein a piece of filtered data may contain one or more sensitive fields;
identifying whether a first user ID exists in the request corresponding to each piece of filtered data containing a sensitive field;
when a first user ID exists in the request corresponding to a piece of filtered data containing a sensitive field, associating the pieces of filtered data containing a sensitive field that share the same first user ID, and determining the associated filtered data as the first data of each user; or
when no first user ID exists in the request corresponding to a piece of filtered data containing a sensitive field, acquiring the IP of that request, acquiring a second user ID that uses the IP, associating the pieces of filtered data containing a sensitive field that share the same second user ID, and determining the associated filtered data as the first data of each user.
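The matching and user-attribution steps above can be sketched as follows. This is a simplified illustration under several assumptions: an in-memory list stands in for the data consumed from Kafka, the two regular expressions (a mainland China mobile number format and an invented `ORD` order number format) are examples only, and the `ip_to_user` mapping is a hypothetical stand-in for the lookup of the second user ID by IP.

```python
import re

# Assumed patterns for sensitive fields; real deployments would define their own.
SENSITIVE_PATTERNS = {
    "phone": re.compile(r"(?<!\d)1[3-9]\d{9}(?!\d)"),
    "order_no": re.compile(r"\bORD\d{8}\b"),  # hypothetical order number format
}

def sensitive_fields(text):
    """Return the names of the sensitive fields found in the text."""
    return [name for name, pattern in SENSITIVE_PATTERNS.items() if pattern.search(text)]

def collect_first_data(log_entries, ip_to_user):
    """Group entries containing sensitive fields by user ID, falling back to the IP."""
    first_data = {}
    for entry in log_entries:
        hits = sensitive_fields(entry["body"])
        if not hits:
            continue  # no sensitive field: not recorded
        # Use the first user ID if present, otherwise look up a second user ID by IP.
        user_id = entry.get("user_id") or ip_to_user.get(entry["ip"])
        if user_id is None:
            continue  # the entry cannot be attributed to any user
        first_data.setdefault(user_id, []).append({**entry, "sensitive": hits})
    return first_data

# Hypothetical filtered data.
entries = [
    {"user_id": "u1", "ip": "10.0.0.1", "body": "phone=13812345678"},
    {"user_id": None, "ip": "10.0.0.2", "body": "order=ORD20211008"},
    {"user_id": "u3", "ip": "10.0.0.3", "body": "status=ok"},
]
first_data = collect_first_data(entries, ip_to_user={"10.0.0.2": "u2"})
```

The second entry carries no user ID but is still attributed to `u2` through its IP, which reflects the fallback rule rather than dropping the record.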
In this embodiment, the preset interfaces may be set in advance. Specifically, a preset interface is an interface that involves sensitive fields; for example, a service system may include 700 interfaces involving sensitive fields, and these interfaces are generally stored in a list.
In this embodiment, the data of the plurality of preset interfaces is consumed and filtered in real time through Kafka, regular-expression matching is used to judge whether each piece of filtered data contains a sensitive field, for example a mobile phone number or an order number, and the filtered data containing sensitive information is recorded.
In this embodiment, since the user ID can be obtained from the recorded filtered data containing sensitive information, recording this data makes it possible to determine the operating user of each record, which facilitates the subsequent user abnormal behavior detection.
In this embodiment, the rule for obtaining the user ID is as follows: it is determined whether a first user ID exists in the request corresponding to each piece of filtered data containing a sensitive field. If a first user ID exists, the IP in the request does not need to be acquired; if no first user ID exists, a second user ID is acquired from the IP of the request. This avoids missing users, which would happen if records without a first user ID were deleted directly without looking up the second user ID, and thus improves the completeness of the acquired users.
In other embodiments, if logs of multiple service systems are mixed together, because the interface patterns of different systems are different, the data of each service system can be distinguished through the interface patterns, so that the accuracy of the acquired first data is improved.
The preprocessing module 203 is configured to preprocess the first data of each user to obtain second data of each user.
In this embodiment, because the first data may have missing data, redundant data, or data with a non-uniform format, after the first data is obtained, the first data is preprocessed, specifically, the preprocessing includes data cleaning and data format conversion on the first data.
In an optional embodiment, the preprocessing module 203 preprocesses the first data of each user, and obtaining the second data of each user includes:
and carrying out data cleaning on the first data of each user, and carrying out format conversion on the cleaned first data according to a preset format conversion rule to obtain second data of each user.
Specifically, the data cleaning comprises one or a combination of the following: missing-value cleaning, format and content cleaning, logic-error cleaning, and removal of unneeded data.
In this embodiment, because the data formats acquired by the plurality of preset interfaces are different, the format conversion is performed on the cleaned first data by using the preset format conversion rule, the format of the second data is unified, the consistency of the data format subsequently used for detecting the abnormal behavior of the user is ensured, and the efficiency and accuracy of detecting the abnormal behavior of the user are further improved.
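A minimal sketch of cleaning followed by format conversion is given below. The record fields (`user_id`, `interface`, `time`), the two accepted input time formats, the lower-casing of interface paths and the ISO 8601 output format are all assumptions chosen for illustration; the text does not specify the preset format conversion rule.

```python
from datetime import datetime

# Assumed input time formats; a real deployment would list its own.
TIME_FORMATS = ("%Y-%m-%d %H:%M:%S", "%Y/%m/%d %H:%M:%S")

def parse_time(text):
    """Parse a timestamp string in any of the assumed input formats."""
    for fmt in TIME_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    raise ValueError(f"unrecognized time format: {text!r}")

def clean_and_convert(first_data):
    """Drop incomplete, malformed and redundant records, then unify the format."""
    seen, second_data = set(), []
    for rec in first_data:
        user = str(rec.get("user_id") or "").strip()
        iface = str(rec.get("interface") or "").strip().lower()
        raw_time = rec.get("time")
        if not user or not iface or not raw_time:
            continue  # missing-value cleaning
        try:
            ts = parse_time(raw_time)
        except ValueError:
            continue  # format cleaning: unparseable timestamp
        key = (user, iface, ts)
        if key in seen:
            continue  # redundant record
        seen.add(key)
        second_data.append({"user_id": user, "interface": iface, "time": ts.isoformat()})
    return second_data

# Hypothetical first data for illustration.
raw = [
    {"user_id": "u1", "interface": "/Order/Query", "time": "2021-10-08 09:00:00"},
    {"user_id": "u1", "interface": "/order/query", "time": "2021/10/08 09:00:00"},
    {"user_id": None, "interface": "/order/query", "time": "2021-10-08 09:05:00"},
    {"user_id": "u2", "interface": "/phone/query", "time": "yesterday"},
]
second = clean_and_convert(raw)
```

Of the four hypothetical records, only the first survives: the second is a duplicate in another format, the third lacks a user ID, and the fourth has an unparseable time.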
An extracting module 204, configured to extract the first feature set and the second feature set of each user from the second data of each user according to a preset extraction rule.
In this embodiment, an extraction rule may be preset, specifically, the preset extraction rule is set according to an abnormal detection requirement in the user abnormal behavior detection request, for example, the preset extraction rule is to extract data of an interface of a sensitive system accessed by a user within 90 days, count an access amount per hour, and establish an access baseline for 90 days.
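The example extraction rule above (hourly access counts over a 90-day window) can be sketched as follows. The function name, the window handling and the sample timestamps are assumptions introduced for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta

def hourly_access_counts(access_times, end, window_days=90):
    """Count accesses per hour within the window_days preceding `end`."""
    start = end - timedelta(days=window_days)
    counts = Counter()
    for ts in access_times:
        if start <= ts < end:
            # Truncate the timestamp to the start of its hour.
            counts[ts.replace(minute=0, second=0, microsecond=0)] += 1
    return counts

# Hypothetical access times of one user to sensitive interfaces.
end = datetime(2021, 10, 12)
accesses = [
    datetime(2021, 10, 11, 9, 5),
    datetime(2021, 10, 11, 9, 40),
    datetime(2021, 10, 11, 14, 0),
    datetime(2021, 6, 1, 9, 0),  # outside the 90-day window, ignored
]
baseline = hourly_access_counts(accesses, end)
```

The resulting per-hour counts form the access baseline from which the feature sets are derived.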
In an alternative embodiment, the extracting module 204 extracts the first feature set and the second feature set of each user from the second data of each user according to a preset extraction rule, including:
extracting third data in a preset time period from the second data of each user;
separating the third data according to the working days and the non-working days in the preset time period to obtain third data corresponding to the working days and third data corresponding to the non-working days;
and extracting the first feature set of each user from the third data corresponding to the working days according to a preset extraction rule, and summarizing and extracting the second feature set of each user from the third data corresponding to the non-working days according to a preset extraction rule.
In this embodiment, in data extraction, differences between visits of working days and non-working days are considered, so that data of the working days and the non-working days are separated during statistics to obtain a first feature set of each user in the working days and a second feature set of each user in the non-working days.
The anomaly detection module 205 is configured to perform anomaly detection on the first feature set of each user by using a plurality of preset algorithms to obtain a first target anomaly value of each user, and perform anomaly detection on the second feature set of each user by using a plurality of preset algorithms to obtain a second target anomaly value of each user.
In this embodiment, the features of each user comprise the first feature set and the second feature set, and a plurality of algorithms may be preset, for example an isolated forest (isolation forest) algorithm and a difference-integrated moving average autoregressive (ARIMA) algorithm.
In this embodiment, the target abnormal value of each user includes a first target abnormal value and a second target abnormal value, where the first target abnormal value refers to an abnormal score that occurs when each user accesses the business system on a working day, and the second target abnormal value refers to an abnormal score that occurs when each user accesses the business system on a non-working day.
In an optional embodiment, the anomaly detection module 205 performs anomaly detection on the first feature set of each user by using a plurality of preset algorithms, and obtaining the first target anomaly value of each user includes:
performing anomaly detection on the first feature set of each user by adopting a preset isolated forest algorithm to obtain a first anomaly value of each user;
performing anomaly detection on the first feature set of each user by adopting a preset difference integration moving average autoregressive algorithm to obtain a second anomaly value of each user;
respectively carrying out normalization processing on the first abnormal value and the second abnormal value to obtain a first probability and a second probability of each user;
calculating the product of the first probability of each user and a first weight value corresponding to the preset isolated forest algorithm to obtain a first product, and calculating the product of the second probability of each user and a second weight value corresponding to the preset difference integration moving average autoregressive algorithm to obtain a second product;
and calculating the sum of the first product and the second product to obtain a first target abnormal value of each user.
In the embodiment, the abnormal value of each user is weighted and calculated by combining an isolated forest algorithm and a difference integration moving average autoregressive algorithm, so that the first target abnormal value of each user is obtained.
Specifically, the preset isolated forest algorithm is based on the principle that abnormal samples can be separated from normal samples with fewer random splits. The training process of the isolated forest algorithm comprises the following steps: 1) randomly selecting y points from the acquired first feature sets of the plurality of users as a subsample, and placing the subsample in the root node of a preset isolated tree; 2) randomly selecting a dimension, and randomly generating a cutting point p within the range of the current node's data, that is, between the minimum value and the maximum value of the selected dimension in the current node's data; 3) generating a hyperplane from the cutting point p to divide the data space of the current node into two subspaces, placing the points whose value in the selected dimension is smaller than p on the left branch of the current node, and placing the points whose value is greater than or equal to p on the right branch of the current node; 4) recursively repeating steps 2) and 3) on the left and right branch nodes of the current node, constructing new nodes until a leaf node holds only one piece of data and cannot be cut further, or the tree reaches a preset height; 5) combining the isolated trees, and calculating the first abnormal value of each user.
Specifically, the flow of the difference-integrated moving average autoregressive (ARIMA) algorithm is as follows: 1) acquiring time series data, namely the number of user accesses to the service system per hour; 2) checking whether the time series data is stationary, and if it is not, applying d-th order differencing to obtain stationary time series data; 3) determining the optimal orders p and q by analyzing the autocorrelation coefficients and the partial autocorrelation coefficients; 4) after the three parameters p, d and q are obtained, training and predicting with the difference-integrated moving average autoregressive model (p, d, q), and calculating the second abnormal value of each user from the predicted value and the actual value.
In this embodiment, the isolated forest algorithm and the difference-integrated moving average autoregressive algorithm are both prior art and are not described in further detail herein.
In an optional embodiment, the anomaly detection module 205 performs anomaly detection on the second feature set of each user by using the preset multiple algorithms, and obtaining the second target anomaly value of each user includes:
performing anomaly detection on the second feature set of each user by adopting the isolated forest algorithm to obtain a third anomaly value of each user;
performing anomaly detection on the second feature set of each user by adopting the differential integration moving average autoregressive algorithm to obtain a fourth anomaly value of each user;
respectively carrying out normalization processing on the third abnormal value and the fourth abnormal value to obtain a third probability and a fourth probability of each user;
calculating the product of the third probability of each user and a third weight value corresponding to the isolated forest algorithm to obtain a third product, and calculating the product of the fourth probability of each user and a fourth weight value corresponding to the differential integration moving average autoregressive algorithm to obtain a fourth product;
and calculating the sum of the third product and the fourth product to obtain a second target abnormal value of each user.
In this embodiment, the first abnormal value and the third abnormal value calculated by the preset isolated forest algorithm, and the second abnormal value and the fourth abnormal value calculated by the preset difference-integrated moving average autoregressive algorithm, may be on different scales. The first, second, third and fourth abnormal values are therefore normalized to the [0,1] interval so that they share the same scale, which ensures consistency among the abnormal values. In addition, the weight of each preset algorithm is taken into account when the first target abnormal value and the second target abnormal value of each user are calculated, which improves the accuracy and reasonableness of both target abnormal values.
In other optional embodiments, because user access to the system differs greatly between working days and non-working days, a first weight value is preset for the isolated forest algorithm and a second weight value is preset for the difference-integrated moving average autoregressive algorithm for working days; a third weight value is preset for the isolated forest algorithm and a fourth weight value is preset for the difference-integrated moving average autoregressive algorithm for non-working days, wherein the first weight value is different from the third weight value, and the second weight value is different from the fourth weight value.
In this embodiment, a plurality of preset algorithms are used to perform anomaly detection on the first feature set and the second feature set of each user separately, to obtain the first target abnormal value and the second target abnormal value of each user. Because the access volumes of working days and non-working days differ, the feature sets of working days and non-working days are examined separately, which ensures the accuracy of the algorithms, avoids a large number of false alarms in subsequent processing, and improves maintenance efficiency.
In this embodiment, a plurality of preset algorithms are used to perform anomaly detection on the feature set of each user, and the feature set of each user is subjected to anomaly detection from multiple dimensions, so as to ensure the accuracy of the obtained target anomaly value of each user.
A determining module 206, configured to determine an abnormal behavior user based on the first target abnormal value and the second target abnormal value of each user.
In this embodiment, there may be one or more abnormal behavior users. Specifically, an abnormal behavior user is a user whose target abnormal value reaches or exceeds a preset target abnormal threshold. The preset target abnormal threshold is set separately for working days and non-working days: the preset first target abnormal threshold is the threshold preset for working days, and the preset second target abnormal threshold is the threshold preset for non-working days.
In an optional embodiment, the determining module 206 determining the abnormal behavior users based on the first target abnormal value and the second target abnormal value of each user includes:
comparing the first target abnormal value of each user with a preset first target abnormal threshold value, and comparing the second target abnormal value of each user with a preset second target abnormal threshold value;
when the first target abnormal value of a user is greater than or equal to the preset first target abnormal threshold value, and/or the second target abnormal value of the user is greater than or equal to the preset second target abnormal threshold value, determining that the user is an abnormal behavior user; or
when the first target abnormal value of a user is smaller than the preset first target abnormal threshold value and the second target abnormal value of the user is smaller than the preset second target abnormal threshold value, determining that the user is a normal behavior user.
In this embodiment, because access volumes differ greatly between working days and non-working days, the target abnormal values also differ greatly. Different target abnormal thresholds are therefore preset for working days and non-working days, which ensures that abnormal behavior users are identified accurately and makes the subsequent abnormal behavior detection for those users more efficient.
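The threshold comparison can be sketched in a few lines. The threshold values and function names below are hypothetical; the text only states that the two thresholds are preset and differ between working days and non-working days.

```python
from typing import Dict, List, Tuple

# Hypothetical thresholds, chosen only for illustration.
FIRST_TARGET_THRESHOLD = 0.8   # working days
SECOND_TARGET_THRESHOLD = 0.7  # non-working days


def is_abnormal(first_target: float, second_target: float) -> bool:
    """Abnormal if either target abnormal value reaches its own threshold."""
    return (first_target >= FIRST_TARGET_THRESHOLD
            or second_target >= SECOND_TARGET_THRESHOLD)


def abnormal_users(targets: Dict[str, Tuple[float, float]]) -> List[str]:
    """Return the IDs of users judged abnormal, in input order."""
    return [uid for uid, (first, second) in targets.items()
            if is_abnormal(first, second)]
```

A user is treated as normal only when both target abnormal values fall below their respective thresholds, which matches the "and/or" wording of the determination step.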
An association analysis module 207, configured to perform association analysis on the abnormal behavior user to obtain an abnormal behavior detection result of the abnormal behavior user.
In this embodiment, the association analysis refers to detecting abnormal behaviors of the abnormal behavior user on data sources other than the target data source.
In an optional embodiment, the association analysis module 207 performs association analysis on the abnormal behavior user in one or more of the following ways:
identifying whether the account of the abnormal behavior user is a shared account; or
identifying whether the abnormal behavior user has VPN permission; or
identifying the number of preset rules hit by the abnormal behavior user; or
identifying whether the abnormal behavior user has resigned; or
identifying whether the abnormal behavior user has permission to access sensitive information; or
identifying whether the account of the abnormal behavior user is lost.
In this embodiment, the analysis covers, for example, whether the account of the abnormal behavior user is a shared account; whether the user has printed a large number of sensitive files; whether the user has triggered preset rules, that is, other monitoring rules; and whether the account of the user is lost, that is, whether operations such as account blocking have been performed on it.
In this embodiment, the abnormal behavior user is subjected to correlation analysis from multiple dimensions in other data sources, so that the abnormal behavior detection is more accurate, and a more accurate and complete abnormal behavior detection result is obtained.
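One way to picture the association analysis is as a set of independent checks whose positive results are collected into a detection result. The sketch below is illustrative only: the `AssociationRecord` fields mirror the six checks listed above, but the class, the field names, and the idea that the facts are already gathered into one record are assumptions.

```python
from dataclasses import asdict, dataclass
from typing import Dict, Union


@dataclass
class AssociationRecord:
    """Hypothetical per-user facts gathered from data sources other than
    the target data source."""
    shared_account: bool
    vpn_permission: bool
    rules_hit: int
    resigned: bool
    sensitive_access: bool
    account_lost: bool


def association_analysis(record: AssociationRecord) -> Dict[str, Union[bool, int]]:
    """Keep only the checks that fired (True flags or a non-zero rule count)."""
    return {name: value for name, value in asdict(record).items() if value}
```

An empty result means that none of the association checks produced supporting evidence for the user flagged by the anomaly scores.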
In other optional embodiments, the abnormal behavior users with the largest target abnormal values may be selected and their association-analysis dimensions listed. Abnormal behavior users with larger target abnormal values tend to hold more permissions and hit more preset rules, which better demonstrates the advantages of the algorithm.
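Picking out the users with the largest target abnormal values could be done as in the following sketch. The function name and the `top_n` parameter are hypothetical; the text does not say how many users are selected.

```python
import heapq
from typing import Dict, List, Tuple


def top_abnormal_users(target_values: Dict[str, float],
                       top_n: int = 3) -> List[Tuple[str, float]]:
    """Return up to top_n (user ID, target abnormal value) pairs,
    largest value first."""
    return heapq.nlargest(top_n, target_values.items(), key=lambda item: item[1])
```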
In summary, the artificial intelligence based user abnormal behavior detection apparatus described in this embodiment determines the target data source and obtains the first data of each user through the interface of that data source, which makes data acquisition more targeted and improves the accuracy of the obtained first data. The first data is preprocessed to unify data formats, which ensures that the data used for user abnormal behavior detection is consistent in format, and the first feature set and the second feature set of each user are then extracted. A plurality of preset algorithms are used to perform anomaly detection on the first feature set and the second feature set separately, so that detection covers multiple dimensions and the accuracy of the target abnormal values of each user is ensured. Abnormal behavior users are determined from the detected target abnormal values, and association analysis is performed on them from multiple dimensions, which yields a more accurate and complete abnormal behavior detection result.
EXAMPLE III
Fig. 3 is a schematic structural diagram of an electronic device according to a third embodiment of the present invention. In the preferred embodiment of the present invention, the electronic device 3 comprises a memory 31, at least one processor 32, at least one communication bus 33 and a transceiver 34.
It will be appreciated by those skilled in the art that the configuration of the electronic device shown in fig. 3 does not constitute a limitation of the embodiment of the present invention. The configuration may be a bus-type configuration or a star-type configuration, and the electronic device 3 may include more or fewer hardware or software components than those shown, or a different arrangement of components.
In some embodiments, the electronic device 3 is an electronic device capable of automatically performing numerical calculation and/or information processing according to instructions set or stored in advance, and the hardware thereof includes but is not limited to a microprocessor, an application specific integrated circuit, a programmable gate array, a digital processor, an embedded device, and the like. The electronic device 3 may also include a client device, which includes, but is not limited to, any electronic product that can interact with a client through a keyboard, a mouse, a remote controller, a touch pad, or a voice control device, for example, a personal computer, a tablet computer, a smart phone, a digital camera, and the like.
It should be noted that the electronic device 3 is only an example. Other existing or future electronic products that can be adapted to the present invention should also be included in the scope of protection of the present invention and are incorporated herein by reference.
In some embodiments, the memory 31 is used for storing program code and various data, such as the artificial intelligence based user abnormal behavior detection apparatus 20 installed in the electronic device 3, and provides high-speed, automatic access to programs and data during the operation of the electronic device 3. The memory 31 includes a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), a One-Time Programmable Read-Only Memory (OTPROM), an Electrically-Erasable Programmable Read-Only Memory (EEPROM), a Compact Disc Read-Only Memory (CD-ROM) or other optical disc memory, a magnetic disk memory, a tape memory, or any other computer-readable medium capable of carrying or storing data.
In some embodiments, the at least one processor 32 may be composed of an integrated circuit, for example, a single packaged integrated circuit, or may be composed of a plurality of integrated circuits packaged with the same function or different functions, including one or more Central Processing Units (CPUs), microprocessors, digital Processing chips, graphics processors, and combinations of various control chips. The at least one processor 32 is a Control Unit (Control Unit) of the electronic device 3, connects various components of the whole electronic device 3 by using various interfaces and lines, and executes various functions of the electronic device 3 and processes data by running or executing programs or modules stored in the memory 31 and calling data stored in the memory 31.
In some embodiments, the at least one communication bus 33 is arranged to enable connection communication between the memory 31 and the at least one processor 32 or the like.
Although not shown, the electronic device 3 may further include a power supply (such as a battery) for supplying power to each component. Optionally, the power supply may be logically connected to the at least one processor 32 through a power management device, so that functions such as charge management, discharge management, and power consumption management are implemented through the power management device. The power supply may also include one or more DC or AC power sources, recharging devices, power failure detection circuits, power converters or inverters, power status indicators, and other such components. The electronic device 3 may further include various sensors, a Bluetooth module, a Wi-Fi module, and the like, which are not described herein again.
It is to be understood that the embodiments described are illustrative only and are not to be construed as limiting the scope of the claims.
The integrated unit implemented in the form of a software functional module may be stored in a computer-readable storage medium. The software functional module is stored in a storage medium and includes several instructions to enable a computer device (which may be a personal computer, an electronic device, or a network device) or a processor (processor) to execute parts of the methods according to the embodiments of the present invention.
In a further embodiment, in conjunction with fig. 2, the at least one processor 32 may execute the operating system of the electronic device 3 as well as various installed applications (such as the artificial intelligence based user abnormal behavior detection apparatus 20), program code, and the like, for example, the above-mentioned modules.
The memory 31 has program code stored therein, and the at least one processor 32 can call the program code stored in the memory 31 to perform related functions. For example, the modules illustrated in fig. 2 are program codes stored in the memory 31 and executed by the at least one processor 32, so as to implement the functions of the modules for the purpose of detecting abnormal behavior of the user based on artificial intelligence.
Illustratively, the program code may be partitioned into one or more modules/units that are stored in the memory 31 and executed by the processor 32 to accomplish the present application. The one or more modules/units may be a series of computer readable instruction segments capable of performing certain functions, which are used for describing the execution process of the program code in the electronic device 3. For example, the program code may be partitioned into parsing module 201, calling module 202, preprocessing module 203, extracting module 204, anomaly detection module 205, determining module 206, and association analysis module 207.
In one embodiment of the present invention, the memory 31 stores a plurality of computer-readable instructions that are executed by the at least one processor 32 to implement the functionality of artificial intelligence based user anomalous behavior detection.
Specifically, for the manner in which the at least one processor 32 implements the above instructions, reference may be made to the description of the relevant steps in the embodiment corresponding to fig. 1, and details are not repeated here.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is only one logical functional division, and other divisions may be realized in practice.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, functional modules in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional module.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned. Furthermore, it is obvious that the word "comprising" does not exclude other elements or that the singular does not exclude the plural. A plurality of units or means recited in the present invention may also be implemented by one unit or means through software or hardware. The terms first, second, etc. are used to denote names, but not any particular order.
Finally, it should be noted that the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting, and although the present invention is described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.

Claims (10)

1. An artificial intelligence based user abnormal behavior detection method is characterized by comprising the following steps:
analyzing the received user abnormal behavior detection request to obtain a target data source;
calling an interface of the target data source, and acquiring first data of each user based on the interface;
preprocessing the first data of each user to obtain second data of each user;
extracting the first feature set and the second feature set of each user from the second data of each user according to a preset extraction rule, wherein the extraction rule comprises the following steps: extracting a first feature set of each user from third data corresponding to a working day, and extracting a second feature set of each user from third data corresponding to a non-working day;
performing anomaly detection on the first feature set of each user by adopting a plurality of preset algorithms to obtain a first target anomaly value of each user, wherein the method comprises the following steps: performing anomaly detection on the first feature set of each user by adopting a preset isolated forest algorithm to obtain a first anomaly value of each user; performing anomaly detection on the first feature set of each user by adopting a preset difference integration moving average autoregressive algorithm to obtain a second anomaly value of each user; performing weighted calculation on the first abnormal value and the second abnormal value to obtain a first target abnormal value of each user;
performing anomaly detection on the second feature set of each user by adopting the preset multiple algorithms to obtain a second target anomaly value of each user, wherein the method comprises the following steps: performing anomaly detection on the second feature set of each user by adopting the isolated forest algorithm to obtain a third anomaly value of each user; performing anomaly detection on the second feature set of each user by adopting the difference integration moving average autoregressive algorithm to obtain a fourth anomaly value of each user; performing weighted calculation on the third abnormal value and the fourth abnormal value to obtain a second target abnormal value of each user;
determining abnormal behavior users based on the first target abnormal value and the second target abnormal value of each user;
and performing correlation analysis on the abnormal behavior user to obtain an abnormal behavior detection result of the abnormal behavior user.
2. The artificial intelligence based user anomalous behavior detection method of claim 1 wherein said invoking an interface of said target data source, said obtaining first data for each user based on said interface comprises:
identifying a service system corresponding to the target data source, and acquiring a plurality of preset interfaces from the service system;
consuming and filtering the data of the plurality of preset interfaces in real time through Kafka to obtain a plurality of pieces of filtered data;
adopting regular matching to the plurality of filtering data, and judging whether each filtering data contains a sensitive field;
when each piece of filtering data contains a sensitive field, recording each piece of filtering data containing the sensitive field, wherein the sensitive field contains one or more sensitive fields;
identifying whether a first user ID exists in the request corresponding to each filtering data containing the sensitive field;
when a first user ID exists in the request corresponding to each filtering data containing the sensitive field, associating the same first user ID with the corresponding filtering data containing the sensitive field, and determining the associated filtering data containing the sensitive field as the first data of each user; or
when the first user ID does not exist in the request corresponding to each filtering data containing the sensitive field, acquiring the IP of the request corresponding to each filtering data containing the sensitive field, acquiring a second user ID using the IP, associating the same second user ID with the corresponding filtering data containing the sensitive field, and determining the associated filtering data containing the sensitive field as the first data of each user.
3. The artificial intelligence based user abnormal behavior detection method according to claim 1, wherein the extracting the first feature set and the second feature set of each user from the second data of each user according to the preset extraction rule comprises:
extracting third data in a preset time period from the second data of each user;
separating the third data according to the working days and the non-working days in the preset time period to obtain third data corresponding to the working days and third data corresponding to the non-working days;
and extracting the first feature set of each user from the third data corresponding to the working days according to a preset extraction rule, and summarizing and extracting the second feature set of each user from the third data corresponding to the non-working days according to a preset extraction rule.
4. The artificial intelligence based user abnormal behavior detection method of claim 1, wherein the performing a weighted calculation on the first and second outliers to obtain a first target outlier for each user comprises:
respectively carrying out normalization processing on the first abnormal value and the second abnormal value to obtain a first probability and a second probability of each user;
calculating the product of the first probability of each user and a first weight value corresponding to the preset isolated forest algorithm to obtain a first product, and calculating the product of the second probability of each user and a second weight value corresponding to the preset difference integration moving average autoregressive algorithm to obtain a second product;
and calculating the sum of the first product and the second product to obtain a first target abnormal value of each user.
5. The artificial intelligence-based user abnormal behavior detection method of claim 4, wherein the performing a weighted calculation on the third abnormal value and the fourth abnormal value to obtain a second target abnormal value for each user comprises:
respectively carrying out normalization processing on the third abnormal value and the fourth abnormal value to obtain a third probability and a fourth probability of each user;
calculating the product of the third probability of each user and a third weight value corresponding to the isolated forest algorithm to obtain a third product, and calculating the product of the fourth probability of each user and a fourth weight value corresponding to the difference integration moving average autoregressive algorithm to obtain a fourth product;
and calculating the sum of the third product and the fourth product to obtain a second target abnormal value of each user.
6. The artificial intelligence based user abnormal behavior detection method of claim 1, wherein said determining abnormal behavior users based on said first target abnormal value and said second target abnormal value of each user comprises:
comparing the first target abnormal value of each user with a preset first target abnormal threshold value, and comparing the second target abnormal value of each user with a preset second target abnormal threshold value;
when the first target abnormal value of a user is greater than or equal to the preset first target abnormal threshold value, and/or the second target abnormal value of the user is greater than or equal to the preset second target abnormal threshold value, determining that the user is an abnormal behavior user; or
when the first target abnormal value of a user is smaller than the preset first target abnormal threshold value and the second target abnormal value of the user is smaller than the preset second target abnormal threshold value, determining that the user is a normal behavior user.
7. The artificial intelligence based user abnormal behavior detection method according to claim 1, wherein the performing of the correlation analysis on the abnormal behavior user comprises one or more of the following ways:
identifying whether the account of the abnormal behavior user is a shared account; or
identifying whether the abnormal behavior user has VPN permission; or
identifying the number of preset rules hit by the abnormal behavior user; or
identifying whether the abnormal behavior user has resigned; or
identifying whether the abnormal behavior user has permission to access sensitive information; or
identifying whether the account of the abnormal behavior user is lost.
8. An artificial intelligence based user abnormal behavior detection apparatus, the apparatus comprising:
the analysis module is used for analyzing the received user abnormal behavior detection request to obtain a target data source;
the calling module is used for calling an interface of the target data source and acquiring first data of each user based on the interface;
the preprocessing module is used for preprocessing the first data of each user to obtain second data of each user;
an extracting module, configured to extract the first feature set and the second feature set of each user from the second data of each user according to a preset extraction rule, where the extracting module includes: extracting a first feature set of each user from third data corresponding to a working day, and extracting a second feature set of each user from third data corresponding to a non-working day;
the anomaly detection module is used for performing anomaly detection on the first feature set of each user by adopting a plurality of preset algorithms to obtain a first target abnormal value of each user, comprising: performing anomaly detection on the first feature set of each user by adopting a preset isolated forest algorithm to obtain a first abnormal value of each user; performing anomaly detection on the first feature set of each user by adopting a preset difference integration moving average autoregressive algorithm to obtain a second abnormal value of each user; and performing weighted calculation on the first abnormal value and the second abnormal value to obtain the first target abnormal value of each user; and is further used for performing anomaly detection on the second feature set of each user by adopting the plurality of preset algorithms to obtain a second target abnormal value of each user, comprising: performing anomaly detection on the second feature set of each user by adopting the isolated forest algorithm to obtain a third abnormal value of each user; performing anomaly detection on the second feature set of each user by adopting the difference integration moving average autoregressive algorithm to obtain a fourth abnormal value of each user; and performing weighted calculation on the third abnormal value and the fourth abnormal value to obtain the second target abnormal value of each user;
the determining module is used for determining abnormal behavior users based on the first target abnormal value and the second target abnormal value of each user;
and the association analysis module is used for performing association analysis on the abnormal behavior user to obtain an abnormal behavior detection result of the abnormal behavior user.
9. An electronic device, characterized in that the electronic device comprises a processor and a memory, the processor being configured to implement the artificial intelligence based user abnormal behavior detection method according to any one of claims 1 to 7 when executing a computer program stored in the memory.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, implements the artificial intelligence based user abnormal behavior detection method according to any one of claims 1 to 7.
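The sensitive-field filtering and user association steps recited in claim 2 can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: Kafka consumption is omitted and the records are taken as an in-memory list; the regular expression, the record keys (`payload`, `ip`, `user_id`), and the `ip_to_user` lookup table are all hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, List, Optional

# Hypothetical sensitive-field pattern; a real deployment would configure its own.
SENSITIVE_PATTERN = re.compile(r"\b(id_card|phone|bank_account)\b")


def group_sensitive_records(records: List[dict],
                            ip_to_user: Dict[str, str]) -> Dict[str, List[dict]]:
    """Keep records whose payload matches a sensitive field and group them by user.

    Each record is assumed to be a dict with keys "payload" and "ip" and an
    optional "user_id". When "user_id" is missing, the user is looked up by IP.
    """
    first_data: Dict[str, List[dict]] = defaultdict(list)
    for record in records:
        if not SENSITIVE_PATTERN.search(record["payload"]):
            continue  # no sensitive field, so the record is not kept
        user_id: Optional[str] = record.get("user_id") or ip_to_user.get(record["ip"])
        if user_id is not None:
            first_data[user_id].append(record)
    return dict(first_data)
```

Records that match no sensitive field are dropped, and records whose user cannot be resolved by either the request or the IP lookup are skipped in this sketch; the claim itself does not say how such records are handled.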
CN202111185035.3A 2021-10-12 2021-10-12 User abnormal behavior detection method and device based on artificial intelligence and related equipment Active CN113949652B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111185035.3A CN113949652B (en) 2021-10-12 2021-10-12 User abnormal behavior detection method and device based on artificial intelligence and related equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111185035.3A CN113949652B (en) 2021-10-12 2021-10-12 User abnormal behavior detection method and device based on artificial intelligence and related equipment

Publications (2)

Publication Number Publication Date
CN113949652A CN113949652A (en) 2022-01-18
CN113949652B true CN113949652B (en) 2023-03-21

Family

ID=79330511

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111185035.3A Active CN113949652B (en) 2021-10-12 2021-10-12 User abnormal behavior detection method and device based on artificial intelligence and related equipment

Country Status (1)

Country Link
CN (1) CN113949652B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant