CN112488175B - Abnormal user detection method based on behavior aggregation characteristics, terminal and storage medium - Google Patents

Abnormal user detection method based on behavior aggregation characteristics, terminal and storage medium

Publication number
CN112488175B
Authority
CN
China
Prior art keywords
behavior
user
users
matrix
similarity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011347823.3A
Other languages
Chinese (zh)
Other versions
CN112488175A (en)
Inventor
李兴国
邹斯达
苗功勋
路冰
孙宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BEIJING ZHONGFU TAIHE TECHNOLOGY DEVELOPMENT CO LTD
Nanjing Zhongfu Information Technology Co Ltd
Zhongfu Information Co Ltd
Zhongfu Safety Technology Co Ltd
Original Assignee
BEIJING ZHONGFU TAIHE TECHNOLOGY DEVELOPMENT CO LTD
Nanjing Zhongfu Information Technology Co Ltd
Zhongfu Information Co Ltd
Zhongfu Safety Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BEIJING ZHONGFU TAIHE TECHNOLOGY DEVELOPMENT CO LTD, Nanjing Zhongfu Information Technology Co Ltd, Zhongfu Information Co Ltd, Zhongfu Safety Technology Co Ltd
Priority to CN202011347823.3A
Publication of CN112488175A
Application granted
Publication of CN112488175B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425 Traffic logging, e.g. anomaly detection
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention provides an abnormal user detection method based on behavior aggregation characteristics, a terminal and a storage medium. User behavior information within a preset time period is acquired; the user's characteristic attributes within the preset time period are aggregated based on access address information; each user's matrix is flattened into a row vector; the correlation coefficient between every pair of users is calculated as their behavior similarity; the two users with the greatest similarity are found and merged into one class; the similarity between that class and the other users is calculated, the similarity matrix is updated with the merged class, and the calculation is iterated; once the iterative calculation reaches a preset threshold, the clustering process stops and any user separated from the intranet group is determined to have abnormal behavior. The invention thereby reduces the false alarm rate of anomaly detection. Abnormal users hidden in the group can be identified, and the security of data information is ensured.

Description

Abnormal user detection method based on behavior aggregation characteristics, terminal and storage medium
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a method for detecting an abnormal user based on behavior aggregation features, a terminal device, and a storage medium.
Background
With the rapid development of computer network technology, data has become an important carrier of information. It carries business information, user information, transaction information and communication information, and plays a very important role for every individual and every enterprise.
Network architectures based on the TCP/IP protocol reach every corner of the world and bring great convenience to people's lives, but they are accompanied by increasingly serious information security problems. Important institutions, large enterprises and the like guard against the risk of data leakage by deploying firewalls, network intrusion detection systems, antivirus software and similar measures in the intranet architecture. However, such single-point detection is often limited to a small set of rules and cannot cope with premeditated eavesdropping, so the security of data information cannot be ensured.
Disclosure of Invention
In order to overcome the defects in the prior art and improve the accuracy of detecting the abnormal behaviors inside the group in the intranet, the invention provides an abnormal user detection method based on behavior aggregation characteristics, which comprises the following steps:
acquiring user behavior information in a preset time period;
aggregating the characteristic attributes in a preset time period of the user based on the access address information;
performing an adjacent element column transformation on the original behavior characteristics to obtain a behavior matrix whose size is the number of target servers, and flattening each user's matrix into a row vector;
calculating the correlation coefficient between every pair of users as the behavior similarity of the two users within the intranet group over the time period;
finding the two users with the greatest similarity according to the similarity matrix among the users, and merging the two users into one class;
calculating the similarity between the class and the other users, updating the similarity matrix with the merged class, and repeating the calculation iteratively;
and stopping the clustering process after the iterative calculation reaches a preset threshold, and determining that a user who remains separated from the intranet group has abnormal behavior.
Further, the traffic five-tuple used by the user is extracted from a detector of the communication network, and the traffic data accessed by the user is captured;
aggregated behavior characteristics per target IP are obtained from the user's accesses at the specified time window granularity;
a behavior characteristic matrix is generated for each source IP.
It should be further noted that, the step of transforming adjacent element columns includes:
cross multiplying the adjacent elements of the characteristic dimension of the user IP1 and the user IP2 to splice a new behavior characteristic matrix:
Figure BDA0002800457770000021
wherein m and n both lie in the interval [1, num_distip];
m and n are element subscripts that denote column indices of the matrix;
i represents a row index of the behavior matrix, and is a positive integer not exceeding 6;
p represents an element in the original behavior feature matrix.
Further, the step of calculating the correlation coefficient between any two users includes:
flattening each user's behavior characteristic matrix into a row vector and calculating the similarity between the two users' behavior vectors with the following formula:
Figure BDA0002800457770000022
X and Y are the behavior vectors of the two users;
n represents the length of the vectors;
EX and EY represent the means of the two vectors.
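The formula image itself is not reproduced in this text. Assuming it is the standard Pearson correlation coefficient, which is consistent with the symbols defined above (X, Y, n, EX, EY) and with the later remark that the coefficient is a cosine similarity after centering, it would read as follows. The summation index k and the symbol \rho are notation introduced here for illustration only.

```latex
\rho(X, Y) =
  \frac{\sum_{k=1}^{n} (X_k - EX)\,(Y_k - EY)}
       {\sqrt{\sum_{k=1}^{n} (X_k - EX)^2}\;\sqrt{\sum_{k=1}^{n} (Y_k - EY)^2}},
\qquad
EX = \frac{1}{n}\sum_{k=1}^{n} X_k,
\quad
EY = \frac{1}{n}\sum_{k=1}^{n} Y_k
```

Under this assumption the coefficient lies in [-1, 1], and a larger value means more similar behavior.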
Further, before the step of acquiring the user behavior information within the preset time period, the method further includes:
acquiring, for each intranet user, behavior information of accessing the target servers within the preset time period.
Further, the behavior information of accessing a target server includes the traffic information, time information and application protocol information generated when the intranet user accesses each target server.
Further, within the preset time period, the target servers to which the intranet user connects are counted and a target server IP list is formed;
the total uplink traffic and total downlink traffic generated when the intranet user connects to each target server, and the total number of connections to each target server, are counted.
Further, within the preset time period, the total duration of the intranet user's connections to each target server and the time elapsed between the user's last connection to the target server and the current time are counted, and are used to represent the user's activity;
and the number of application layer protocol types used in the intranet user's connections to each target server within the preset time period is counted.
The invention also provides a terminal device for implementing the abnormal user detection method based on behavior aggregation characteristics, comprising:
a memory for storing a computer program implementing the abnormal user detection method based on behavior aggregation characteristics;
and a processor for executing the computer program so as to implement the steps of the abnormal user detection method based on behavior aggregation characteristics.
The invention also provides a readable storage medium storing a computer program which, when executed by a processor, implements the steps of the abnormal user detection method based on behavior aggregation characteristics.
From the above technical scheme, the invention has the following advantages:
the abnormal user detection method based on behavior aggregation characteristics extracts attributes that are frequently shared across traffic records, such as time, protocol, traffic bytes and connection frequency, as reusable behavior features; it extends well over time and executes efficiently.
The method can be applied to anomaly detection within an intranet group. Similarity-based outlier detection can help security analysts in enterprises and other organizations trace clues of data leakage in the intranet, capture inconspicuous abnormal behavior, and handle security events as early as possible, so that losses are minimized.
In real scenarios, the invention helps analysts cope with the unbalanced behavior distribution caused by users' working habits or task assignments, overcomes the problem that information about adjacent access nodes cannot be associated, and greatly reduces the false alarm rate of anomaly detection. Abnormal users hidden in the group can be identified, and the security of data information is ensured.
Drawings
In order to more clearly illustrate the technical solutions of the present invention, the drawings that are needed in the description will be briefly introduced below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a method for detecting abnormal users based on behavior aggregation features;
FIG. 2 is an exemplary diagram of an abnormal user detection methodology architecture based on behavior aggregation features;
FIG. 3 is a flowchart of an embodiment of a method for detecting abnormal users based on behavior aggregation features.
Detailed Description
Those of ordinary skill in the art will appreciate that the elements and algorithm steps described in connection with the embodiments disclosed herein may be embodied in electronic hardware, in computer software, or in a combination of the two, and that the elements and steps of the examples have been generally described in terms of function in the foregoing description to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
To improve the accuracy of detecting abnormal users within an intranet group, the invention provides an abnormal user detection method based on behavior aggregation characteristics, as shown in figure 1. The method runs on a terminal device with a readable storage medium. Relevant information is read according to a preset method and the user's data within the time period is counted; association information between adjacent target IPs is fully mined; the similarity calculation is performed by the processor running code stored in the readable storage medium; and an anomaly detection module applies a systematic clustering method to the result of the previous step to find potential outlier users.
The users to which the present invention relates are intranet users, that is, users of a network established within a local area network or within an organization's office network. The intranet may also be connected to an external network. The local area network may be an intranet based on Ethernet technology, possibly over dedicated optical fiber, serving a single building, a community, an office area, or the like.
The system architecture 100 in an intranet may include one or more of the terminal devices 101, 102, 103, a network 104, and a server 105. The communication network 104 is a medium to provide a communication link between the terminal devices 101, 102, 103 and the server 105. The communication network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
It should be understood that the number of terminal devices, networks and servers in fig. 2 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation. For example, the server 105 may be a server cluster formed by a plurality of servers.
It should be noted that, the computer system 200 of the electronic device shown in fig. 2 is only an example, and should not impose any limitation on the functions and the application scope of the embodiments of the present disclosure.
The user may interact with the server 105 via the network 104 using the terminal devices 101, 102, 103 to receive or send messages or the like. The terminal devices 101, 102, 103 may be a variety of electronic devices with display screens including, but not limited to, smartphones, tablet computers, laptop and desktop computers, digital cinema projectors, and the like.
The server 105 may be a server providing various services. For example, the user transmits data information to the server 105 using the terminal device 103 (may be the terminal device 101 or 102).
The method specifically comprises the following steps:
S101, acquiring user behavior information within a preset time period;
The preset time period may be set according to the actual environment and requirements, for example in hours, days, weeks or months. The user behavior information is derived from the terminal device's access to, and data operations on, the target server.
S102, aggregating the user's characteristic attributes within the preset time period based on access address information;
The access address information includes the access source IP and the access destination IP.
S103, performing an adjacent element column transformation on the original behavior characteristics to obtain a behavior matrix whose size is the number of target servers, and flattening each user's matrix into a row vector;
S104, calculating the correlation coefficient between every pair of users as the behavior similarity of the two users within the intranet group over the time period; the larger the coefficient, the higher the similarity.
S105, finding the two users with the greatest similarity according to the similarity matrix among the users, and merging the two users into one class;
S106, calculating the similarity between the class and the other users, updating the similarity matrix with the merged class, and repeating the calculation iteratively;
S107, stopping the clustering process after the iterative calculation reaches a preset threshold, and determining that a user who remains separated from the intranet group has abnormal behavior.
From the above method steps and fig. 1 it can be seen that, owing to the characteristics of traffic data, attributes that are frequently shared across traffic records, such as time, protocol, traffic bytes and connection frequency, can be extracted as reusable behavior features; the method extends well over time and executes efficiently.
The method can be applied to anomaly detection within an intranet group. Similarity-based outlier detection can help security analysts in enterprises and other organizations trace clues of data leakage in the intranet, capture inconspicuous abnormal behavior, and handle security events as early as possible, so that losses are minimized.
In real scenarios, the invention helps analysts cope with the unbalanced behavior distribution caused by users' working habits or task assignments, overcomes the problem that information about adjacent access nodes cannot be associated, and greatly reduces the false alarm rate of anomaly detection. Abnormal users hidden in the group can be identified, and the security of data information is ensured.
The flowcharts and block diagrams in the figures of the behavior-aggregation-feature-based abnormal user detection method provided by the present invention illustrate the architecture, functionality, and operation of possible implementations of methods, apparatuses, and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The embodiment of the device described in the method for detecting abnormal users based on behavior aggregation features is merely illustrative, for example, the division of the units is merely a logic function division, and other division modes may be available in actual implementation, for example, multiple units or components may be combined or may be integrated into another system, or some features may be omitted or not implemented. In addition, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices, or elements, or may be an electrical, mechanical, or other form of connection.
FIG. 3 schematically illustrates a flow chart of another method of abnormal user detection based on behavior aggregation features, according to an embodiment of the present disclosure. The method steps of the present embodiment may be performed by the terminal device, or may be performed by the server, or may be performed by the terminal device and the server interactively, for example, may be performed by the server 105 in fig. 2, which is described above, but the disclosure is not limited thereto.
S201, extracting the traffic five-tuple used by the user from a detector of the communication network, and capturing the traffic data accessed by the user;
The traffic five-tuple comprises the intranet user IP and port and the target server IP and port.
Aggregated behavior characteristics per target IP are obtained from the user's accesses at the specified time window granularity;
a behavior characteristic matrix is generated for each source IP.
S202, acquiring, for each intranet user, behavior information of accessing the target servers within a preset time period.
S203, the behavior information of accessing a target server includes the traffic information, time information and application protocol information generated when the intranet user accesses each target server.
S204, counting the target servers to which the intranet user connects within the preset time period, and forming a target server IP list;
S205, counting the total uplink traffic and total downlink traffic generated when the intranet user connects to each target server, and the total number of connections to each target server.
S206, counting, within the preset time period, the total duration of the intranet user's connections to each target server and the time elapsed between the user's last connection to the target server and the current time, which are used to represent the user's activity;
S207, counting the number of application layer protocol types used in the intranet user's connections to each target server within the preset time period.
S208, acquiring user behavior information within the preset time period;
S209, aggregating the user's characteristic attributes within the preset time period based on access address information;
S210, performing an adjacent element column transformation on the original behavior characteristics to obtain a behavior matrix whose size is the number of target servers, and flattening each user's matrix into a row vector;
S211, calculating the correlation coefficient between every pair of users as the behavior similarity of the two users within the intranet group over the time period; the larger the coefficient, the higher the similarity.
The aggregated user behavior feature matrix is shown in Table 1 below and is constructed as follows:
TABLE 1
Figure BDA0002800457770000071
Figure BDA0002800457770000081
1) Acquiring flow data accessed by a user from a flow quintuple of a detector;
2) Obtaining target IP aggregation behavior characteristics according to user access in a specified time window granularity;
3) A behavior feature matrix is generated for each source IP of the behavior. For coarse-grained time windows, all behavior information is directly aggregated, and only one feature matrix is generated per source IP.
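The three construction steps above can be sketched roughly as follows. This is a minimal Python sketch under stated assumptions, not the patented implementation: the record field names (src_ip, dst_ip, up_bytes, down_bytes, start, end, app_proto), the function name build_feature_matrices and the sample addresses are hypothetical, and the six-row order follows the rows listed for Table 1 later in the description.

```python
from collections import defaultdict

# Row order assumed from the Table 1 rows listed later in the description.
ROWS = ("connection frequency", "upstream traffic", "downstream traffic",
        "protocol types", "connection duration", "time since last connection")


def build_feature_matrices(flows, dst_ips, now):
    """Aggregate flow records into one 6 x len(dst_ips) matrix per source IP.

    Coarse-grained case: every record in the period is aggregated, so each
    source IP ends up with exactly one matrix.
    """
    col = {ip: j for j, ip in enumerate(dst_ips)}
    matrices = defaultdict(lambda: [[0.0] * len(dst_ips) for _ in ROWS])
    protocols = defaultdict(set)  # (src, dst) -> application protocols seen
    last_end = {}                 # (src, dst) -> latest connection end time

    for f in flows:
        src, dst = f["src_ip"], f["dst_ip"]
        if dst not in col:
            continue  # destination outside the target server IP list
        j = col[dst]
        m = matrices[src]
        m[0][j] += 1                      # connection frequency
        m[1][j] += f["up_bytes"]          # total upstream traffic
        m[2][j] += f["down_bytes"]        # total downstream traffic
        m[4][j] += f["end"] - f["start"]  # total connection duration
        protocols[(src, dst)].add(f["app_proto"])
        last_end[(src, dst)] = max(last_end.get((src, dst), f["end"]), f["end"])

    for (src, dst), protos in protocols.items():
        j = col[dst]
        matrices[src][3][j] = float(len(protos))               # protocol types
        matrices[src][5][j] = float(now - last_end[(src, dst)])  # recency
    return dict(matrices)


# Hypothetical sample data.
flows = [
    {"src_ip": "10.0.0.1", "dst_ip": "10.0.1.1", "up_bytes": 100,
     "down_bytes": 400, "start": 0, "end": 10, "app_proto": "http"},
    {"src_ip": "10.0.0.1", "dst_ip": "10.0.1.1", "up_bytes": 50,
     "down_bytes": 50, "start": 20, "end": 25, "app_proto": "ssh"},
    {"src_ip": "10.0.0.2", "dst_ip": "10.0.1.2", "up_bytes": 10,
     "down_bytes": 20, "start": 5, "end": 8, "app_proto": "http"},
]
matrices = build_feature_matrices(flows, ["10.0.1.1", "10.0.1.2"], now=100)
```

Servers that a user never contacted are left at 0 in every row, which is a simplification chosen for this sketch; the source does not say how such entries are filled.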
The adjacent element column transformation used in the method includes:
cross-multiplying adjacent elements along the feature dimension for user IP1 and user IP2 and splicing the results into a new behavior feature matrix:
Figure BDA0002800457770000082
wherein m and n both lie in the interval [1, num_distip];
m and n are element subscripts that denote column indices of the matrix;
i represents the row index of the behavior matrix and is a positive integer not exceeding 6;
p represents an element of the original behavior feature matrix.
Calculating the correlation coefficient between any two users includes:
flattening each user's behavior feature matrix into a row vector and calculating the similarity between the two users' behavior vectors with the following formula:
Figure BDA0002800457770000083
X and Y are the behavior vectors of the two users;
n represents the length of the vectors;
EX and EY represent the means of the two vectors. The correlation coefficient is a cosine similarity computed after centering; it indicates whether the two vectors point in similar directions in space and also handles the differing scales of the components.
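As an illustration of the similarity step, the following minimal Python sketch flattens a behavior matrix into a row vector and computes a centered cosine similarity (the Pearson coefficient). The function names flatten and pearson are hypothetical, and returning 0.0 for a constant vector is an assumption of this sketch, since the source does not address that case.

```python
import math


def flatten(matrix):
    """Flatten a behavior feature matrix (list of rows) into one row vector."""
    return [value for row in matrix for value in row]


def pearson(x, y):
    """Centered cosine similarity between two equal-length behavior vectors."""
    if len(x) != len(y):
        raise ValueError("behavior vectors must have the same length")
    n = len(x)
    ex, ey = sum(x) / n, sum(y) / n  # EX and EY: means of the two vectors
    cov = sum((a - ex) * (b - ey) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - ex) ** 2 for a in x))
    sy = math.sqrt(sum((b - ey) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # assumption: a constant vector is treated as uncorrelated
    return cov / (sx * sy)


# Two users whose traffic differs only by a scale factor are fully similar.
sim_same_shape = pearson(flatten([[1, 2], [3, 4]]), flatten([[2, 4], [6, 8]]))
sim_opposite = pearson([1, 2, 3], [3, 2, 1])
```

Because the vectors are centered and normalized, a user who generates twice the traffic of another in the same proportions still receives a similarity close to 1.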
S212, finding the two users with the greatest similarity according to the similarity matrix among the users, and merging the two users into one class;
S213, calculating the similarity between the class and the other users, updating the similarity matrix with the merged class, and repeating the calculation iteratively;
S214, stopping the clustering process after the iterative calculation reaches a preset threshold; at this point, any user who remains separated from the intranet group is identified as having abnormal behavior.
The present invention also provides a further embodiment. The method steps of this embodiment may be performed by a terminal device, by a server, or interactively by the terminal device and the server; for example, they may be performed by the server 105 in fig. 2 described above, but the disclosure is not limited thereto.
The traffic five-tuple used by the user is extracted from a detector of the communication network, and the traffic data accessed by the user is captured; the traffic five-tuple comprises the intranet user IP and port and the target server IP and port.
Aggregated behavior characteristics per target IP are obtained from the user's accesses at the specified time window granularity;
a behavior characteristic matrix is generated for each source IP.
For each intranet user, behavior information of accessing the target servers within a preset time period is acquired.
The behavior information of accessing a target server includes the traffic information, time information and application protocol information generated when the intranet user accesses each target server.
The target servers to which the intranet user connects within the preset time period are counted, and a target server IP list is formed;
the IP list of the target servers to which the intranet user connects within the preset time period, the total uplink traffic and total downlink traffic generated by the connections to each target server, and the total number of connections to each target server are counted;
the end time minus the start time of each of the intranet user's connections within the preset time period, that is, the total duration of connection to each target server, is counted together with the time elapsed from the last connection to each target server to the current time, and these are used to represent the activity;
and the number of application layer protocol types commonly used by the intranet user when connecting to each target server within the time period is counted.
Matrix reconstruction: based on the statistical features obtained, the data are aggregated by the user's source IP (Source IP) and the accessed server's destination IP (Dist IP), so that each user yields one behavior feature matrix.
The user behavior feature matrix is shown in Table 1 below:
TABLE 1
Feature                 dstip_1  dstip_2  dstip_3  dstip_4  ……  dstip_n
Connection frequency
Upstream traffic
Downstream traffic
Protocol type
Connection duration
Last connection
Calculating similarity
a) Feature engineering
Starting from the preprocessed data obtained in the behavior feature matrix step, feature engineering is applied to the feature matrix to meet the requirements of the algorithm:
in the coarse-granularity case, adjacent element columns are transformed. Because it aggregates all behavior information within the preset time period, the original feature matrix stores the user's long-term behavior information, but the correlation between different target servers has not yet been mined, and two servers in adjacent address segments often carry more related information. The following transformation is therefore applied to the behavior feature matrix of Table 1:
Figure BDA0002800457770000111
The final output is a square behavior feature matrix whose size is the number of target servers accessed.
b) Correlation coefficient between behavior vectors
Assuming two users Q and C, the correlation coefficient between them is calculated as follows.
To simplify the calculation, the behavior feature matrix obtained in the feature engineering step is flattened into a row vector, and the following formula is then applied. When only traffic data are used, the behavior similarity between users Q and C takes the form shown in Table 3:
Figure BDA0002800457770000112
TABLE 3 User similarity matrix example
        Q
C       Similarity(Q,C)
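The similarity calculation can be sketched in Python as below. The claims define X and Y as the two behavior vectors, n as the vector length, and EX, EY as their means, which matches the standard Pearson correlation coefficient; the sketch assumes that this is the formula intended. Returning 0.0 for a constant vector and the function names pearson and similarity_matrix are assumptions made for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("vectors must be non-empty and of equal length")
    ex = sum(x) / n  # EX: mean of X
    ey = sum(y) / n  # EY: mean of Y
    cov = sum((a - ex) * (b - ey) for a, b in zip(x, y))
    var_x = sum((a - ex) ** 2 for a in x)
    var_y = sum((b - ey) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # Assumption: a constant vector is treated as uncorrelated.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def similarity_matrix(vectors):
    """vectors maps user -> flattened behavior vector; returns (users, matrix)."""
    users = sorted(vectors)
    matrix = [
        [1.0 if a == b else pearson(vectors[a], vectors[b]) for b in users]
        for a in users
    ]
    return users, matrix
```

Each entry of the resulting matrix plays the role of Similarity(Q,C) in Table 3 for one pair of users.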
Locating anomalous users
Abnormal users within the intranet group are found according to the similarity.
a) Based on the similarity matrix between all users obtained in the above steps, each of the N users is initially assumed to form a class of its own.
b) The two users with the greatest similarity in the matrix are merged into one class, and the distance between this class and each other user is then calculated. Assuming class K is formed by merging users I and J, the distance between a user H and class K is as follows:
c) The similarity matrix is updated, and step b) is repeated iteratively.
A threshold is set according to actual service conditions, and the calculation stops once the expected effect is achieved. Any user who still forms a class alone at that point is inferred to be a user with abnormal behavior.
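Steps a) to c) and the stopping rule can be sketched in Python as follows. The formula for the distance between a user and a merged class is missing from the text, so this sketch substitutes average linkage (the mean pairwise similarity between members of two classes); that choice, the similarity-based stopping threshold, and the name find_isolated_users are assumptions, not the method stated in the source.

```python
def find_isolated_users(users, sim, threshold):
    """Agglomerative clustering on a similarity matrix.

    users: list of user identifiers.
    sim:   square matrix, sim[i][j] is the similarity of users i and j.
    Merging stops when the best remaining similarity falls below threshold.
    Returns the users that still form a class of their own.
    """
    # a) Every user starts as its own class.
    clusters = [[i] for i in range(len(users))]

    def linkage(a, b):
        # Assumed average linkage between two classes.
        return sum(sim[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        # b) Find the two most similar classes.
        best, pair = None, None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                s = linkage(clusters[x], clusters[y])
                if best is None or s > best:
                    best, pair = s, (x, y)
        if best < threshold:
            break
        # c) Merge them and iterate on the updated set of classes.
        x, y = pair
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]

    return [users[c[0]] for c in clusters if len(c) == 1]
```

With four users where a, b and c are mutually similar (0.8 to 0.9) and d has similarity 0.1 to everyone, a threshold of 0.5 merges a, b and c and leaves d alone, so d is reported.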
The invention fits many types of service scenarios and overcomes several pain points. In a real scenario, the degree of similarity between users can be measured from their daily behavior characteristics, which reduces the false alarm rate of anomaly detection. Abnormal users hidden in the group can be identified, ensuring the security of data and information.
Based on the above method, the invention also provides a terminal device for implementing the abnormal user detection method based on behavior aggregation characteristics, comprising: a memory for storing a computer program that implements the abnormal user detection method based on behavior aggregation characteristics; and a processor for executing the computer program so as to realize the steps of the abnormal user detection method based on behavior aggregation characteristics.
Based on the above method, the invention further provides a readable storage medium storing a computer program that, when executed by a processor, implements the steps of the abnormal user detection method based on behavior aggregation characteristics.
The terminal device includes a central processing unit (CPU), which can perform the abnormal user detection method based on behavior aggregation characteristics according to a program stored in a read-only memory (ROM) or a program loaded from a storage section into a random access memory (RAM). The RAM also stores the various programs and data required for system operation. The CPU, ROM and RAM are connected to each other by a bus. An input/output (I/O) interface is also connected to the bus.
The following components are connected to the I/O interface: an input section including a keyboard, a mouse, and the like; an output section including a cathode ray tube (CRT) or liquid crystal display (LCD), a speaker, and the like; a storage section including a hard disk or the like; and a communication section including a network interface card such as a local area network (LAN) card, a modem, or the like. The communication section performs communication processing via a network such as the Internet. A drive is also connected to the I/O interface as needed. Removable media such as magnetic disks, optical disks, magneto-optical disks, and semiconductor memories are mounted on the drive as needed, so that a computer program read from them is installed into the storage section as needed.
The units and algorithm steps of the examples described in connection with the embodiments disclosed herein, by which the terminal device implements the abnormal user detection method based on behavior aggregation characteristics, can be implemented in electronic hardware, computer software, or a combination of both. To clearly illustrate the interchangeability of hardware and software, the components and steps of each example have been described above generally in terms of functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (4)

1. An abnormal user detection method based on behavior aggregation features is characterized by comprising the following steps:
counting the behavior characteristics of access to target servers in a preset time period;
the counting of the behavior characteristics of access to target servers in the preset time period comprises:
counting the target servers connected with the intranet users in a preset time period, and forming a target server IP list;
counting an IP list of an intranet user and a connected target server in a preset time period, the sum of uplink and downlink flow generated by connection with each target server, and the total frequency of connection with the target server;
counting the ending time of each connection of the intranet user in a preset time period minus the starting time, namely the total duration of connection with each target server and the duration of the last connection time of each target server from the current time;
counting the number of commonly used application layer protocol types of the intranet user connected with each target server in the time period;
aggregating the behavior characteristics according to the target IP accessed by the user in a specified time window granularity;
generating an original behavior feature matrix based on each source IP of the behavior feature;
based on the original behavior feature matrix, performing adjacent element column transformation to obtain a behavior matrix with the size being the number of the target servers, and configuring the matrix of each user into a row of vectors;
calculating correlation coefficients between any two users respectively to serve as behavior similarity of the two users in the intranet group in a period of time;
searching two users with the maximum similarity according to the similarity matrix among the users, and gathering the two users into a class;
calculating the similarity between the class and other users, updating and gathering a similarity matrix of the class of users, and repeating iterative calculation;
after repeated iterative computation reaches a preset threshold, stopping the clustering process, wherein the user who breaks away from the intranet group is identified as having abnormal behavior;
the adjacent element column transformation includes:
cross multiplying adjacent elements of characteristic dimension, and piecing up to form a new behavior characteristic matrix:
Figure QLYQS_1
wherein m and n both lie in the interval [1, num_distip];
m and n denote the subscripts of the elements, namely the column indices of the matrix;
i represents a row index of the behavior matrix;
p represents an element in the original behavior feature matrix.
2. The method for detecting an abnormal user based on a behavior aggregation feature according to claim 1, wherein,
the calculating of the correlation coefficient between any two users comprises:
configuring each user behavior feature matrix into a row vector, and calculating the similarity between the two user behavior vectors using the following formula:
Figure QLYQS_2
X and Y are the behavior vectors of the two users;
n represents the length of the vectors;
EX and EY represent the means of the two vectors.
3. A terminal device for implementing a method for detecting abnormal users based on behavior aggregation features, comprising:
a memory for storing a computer program;
a processor for executing the computer program to implement the steps of the behavior aggregation feature based abnormal user detection method of any one of claims 1 to 2.
4. A readable storage medium, characterized in that it has stored thereon a computer program that is executed by a processor to implement the steps of the behavior aggregation feature based abnormal user detection method according to any one of claims 1 to 2.
CN202011347823.3A 2020-11-26 2020-11-26 Abnormal user detection method based on behavior aggregation characteristics, terminal and storage medium Active CN112488175B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011347823.3A CN112488175B (en) 2020-11-26 2020-11-26 Abnormal user detection method based on behavior aggregation characteristics, terminal and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011347823.3A CN112488175B (en) 2020-11-26 2020-11-26 Abnormal user detection method based on behavior aggregation characteristics, terminal and storage medium

Publications (2)

Publication Number Publication Date
CN112488175A CN112488175A (en) 2021-03-12
CN112488175B true CN112488175B (en) 2023-06-23

Family

ID=74935842

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011347823.3A Active CN112488175B (en) 2020-11-26 2020-11-26 Abnormal user detection method based on behavior aggregation characteristics, terminal and storage medium

Country Status (1)

Country Link
CN (1) CN112488175B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117074248A (en) * 2023-04-18 2023-11-17 国网宁夏电力有限公司中卫供电公司 Method and system for monitoring SF6 gas density after digital transformation

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2005216066A (en) * 2004-01-30 2005-08-11 Internatl Business Mach Corp <Ibm> Error detection system and method therefor
CN108270620A (en) * 2018-01-15 2018-07-10 深圳市联软科技股份有限公司 Network anomaly detection method, device, equipment and medium based on Portrait brand technology
CN108322428A (en) * 2017-01-18 2018-07-24 阿里巴巴集团控股有限公司 A kind of abnormal access detection method and equipment
CN111431909A (en) * 2020-03-27 2020-07-17 南京聚铭网络科技有限公司 Method and device for detecting grouping abnormity in user entity behavior analysis and terminal
CN111586001A (en) * 2020-04-28 2020-08-25 咪咕文化科技有限公司 Abnormal user identification method and device, electronic equipment and storage medium
CN111641629A (en) * 2020-05-28 2020-09-08 腾讯科技(深圳)有限公司 Abnormal behavior detection method, device, equipment and storage medium
CN111784392A (en) * 2020-06-29 2020-10-16 中国平安财产保险股份有限公司 Abnormal user group detection method, device and equipment based on isolated forest

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050210027A1 (en) * 2004-03-16 2005-09-22 International Business Machines Corporation Methods and apparatus for data stream clustering for abnormality monitoring
US11178161B2 (en) * 2019-04-18 2021-11-16 Oracle International Corporation Detecting anomalies during operation of a computer system based on multimodal data

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
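Early-warning method for abnormal user behavior in electric power information intranets based on similarity analysis; Jin Qianqian et al.; Computer Systems & Applications; 2017-12-15; Vol. 26, No. 12; pp. 220-226 *
Model and implementation of an anomaly intrusion detection system based on similarity clustering analysis; Wang Lina et al.; Journal of Chinese Computer Systems; 2004-07-31; Vol. 25, No. 7; pp. 1333-1336 *
Cluster-based user feature analysis; He Kun; China Master's Theses Full-text Database; 2009-02-15; pp. 1-70 *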

Also Published As

Publication number Publication date
CN112488175A (en) 2021-03-12


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant