CN110351307A - Abnormal user detection method and system based on integrated study - Google Patents

Abnormal user detection method and system based on integrated study

Info

Publication number
CN110351307A
Authority
CN
China
Prior art keywords
user
information
abnormal
default
doubtful
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910751220.0A
Other languages
Chinese (zh)
Other versions
CN110351307B (en)
Inventor
莫凡
范渊
刘博�
何帅
孙佳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
DBAPPSecurity Co Ltd
Hangzhou Dbappsecurity Technology Co Ltd
Original Assignee
Hangzhou Dbappsecurity Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Dbappsecurity Technology Co Ltd filed Critical Hangzhou Dbappsecurity Technology Co Ltd
Priority to CN201910751220.0A priority Critical patent/CN110351307B/en
Publication of CN110351307A publication Critical patent/CN110351307A/en
Application granted granted Critical
Publication of CN110351307B publication Critical patent/CN110351307B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/14Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425Traffic logging, e.g. anomaly detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Security & Cryptography (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Computer And Data Communications (AREA)

Abstract

The present invention provides an abnormal user detection method and system based on ensemble learning, relating to the technical field of network security. The method includes: acquiring behavior information to be detected of a user, where the behavior information to be detected includes at least one item of behavior feature information; comparing the behavior feature information with the preset feature baseline corresponding to that behavior feature information to obtain a comparison result; extracting abnormal behavior information from the behavior information to be detected according to the comparison result, and determining each user who has abnormal behavior information as a suspected abnormal user; and finally scoring the suspected abnormal users with a preset ensemble learning model and determining each suspected abnormal user whose scoring result reaches a preset score as an abnormal user. The invention builds the detection system with the user as the core object and can accurately locate abnormal users based on the preset ensemble learning model, so that insider threats are discovered and stopped in time and information leakage is prevented.

Description

Abnormal user detection method and system based on ensemble learning
Technical field
The present invention relates to the technical field of network security, and in particular to an abnormal user detection method and system based on ensemble learning.
Background art
With the continuing development of Internet technology and China's deepening pursuit of its big data strategy, data collection terminals are increasingly numerous and the types of data they gather are increasingly rich, so that data has become one of an enterprise's key assets. As the value of data receives more attention, the data security threats facing enterprises are also becoming more varied and more serious, and information security assurance is gradually focusing on the security of data.
Under normal circumstances, external attacks are numerous in kind and occur continuously and frequently, and enterprises are accustomed to devoting their resources to building security fortifications against attacks from outside. Besides external hacker attacks, however, leakage incidents in which internal staff take part in selling information or sharing it with third parties in violation of the rules also occur one after another.
For enterprises that have already recognized the urgency of this problem, traditional security techniques fail to help them effectively solve security problems that originate from inside. The reason is that conventional methods are mostly scattered and after-the-fact, and lack specificity.
Summary of the invention
The purpose of the present invention is to provide an abnormal user detection method and system based on ensemble learning that can discover in time leakage incidents in which internal staff take part in selling information or sharing it with third parties in violation of the rules, thereby avoiding property loss.
The abnormal user detection method based on ensemble learning provided by the invention includes: acquiring behavior information to be detected of a user, where the behavior information to be detected includes at least one item of behavior feature information; comparing the behavior feature information with the preset feature baseline corresponding to the behavior feature information to obtain a comparison result; extracting abnormal behavior information from the behavior information to be detected according to the comparison result, and determining each user who has the abnormal behavior information as a suspected abnormal user; and scoring the suspected abnormal user with a preset ensemble learning model, and determining each suspected abnormal user whose scoring result reaches a preset score as an abnormal user.
Further, the preset feature baseline is determined by the user group to which the user belongs; or, the preset feature baseline is determined by the historical information of the user.
Further, the preset ensemble learning model is a fusion of at least two detection algorithms. Scoring the suspected abnormal user with the preset ensemble learning model includes: determining the at least two detection algorithms and the weight of each detection algorithm; scoring the suspected abnormal user with each of the at least two detection algorithms to obtain at least two abnormal scores; and fusing the at least two abnormal scores based on the weights to obtain the scoring result.
Further, the detection algorithms include: the isolation forest algorithm, One-Class SVM, and the local outlier factor algorithm.
Further, the method also includes: acquiring identity information and entity information of the user; and, based on the behavior information to be detected, associating the identity information with the entity information to restore the network session of the user.
The abnormal user detection system based on ensemble learning provided by the invention includes: a first acquisition module, configured to acquire behavior information to be detected of a user, where the behavior information to be detected includes at least one item of behavior feature information; a comparison module, configured to compare the behavior feature information with the preset feature baseline corresponding to the behavior feature information to obtain a comparison result; an extraction module, configured to extract abnormal behavior information from the behavior information to be detected according to the comparison result and to determine each user who has the abnormal behavior information as a suspected abnormal user; and a scoring module, configured to score the suspected abnormal user with a preset ensemble learning model and to determine each suspected abnormal user whose scoring result reaches a preset score as an abnormal user.
Further, the preset ensemble learning model is a fusion of at least two detection algorithms. The scoring module includes: a determination unit, configured to determine the at least two detection algorithms and the weight of each detection algorithm; a scoring unit, configured to score the suspected abnormal user with each of the at least two detection algorithms to obtain at least two abnormal scores; and a fusion unit, configured to fuse the at least two abnormal scores based on the weights to obtain the scoring result.
Further, the system also includes: a second acquisition module, configured to acquire identity information and entity information of the user; and an association module, configured to associate the identity information with the entity information based on the behavior information to be detected, so as to restore the network session of the user.
The present invention also provides an electronic device including a memory and a processor, the memory storing a computer program that can run on the processor, where the processor implements the method when executing the computer program.
The present invention also provides a computer-readable medium storing non-volatile program code executable by a processor, where the program code causes the processor to execute the method.
In the abnormal user detection method and system based on ensemble learning provided by the invention, the behavior information to be detected of a user is first acquired, where the behavior information to be detected includes at least one item of behavior feature information; the behavior feature information is then compared with the corresponding preset feature baseline to obtain a comparison result; abnormal behavior information is extracted from the behavior information to be detected according to the comparison result, and each user who has abnormal behavior information is determined as a suspected abnormal user; finally, the suspected abnormal users are scored with the preset ensemble learning model, and each suspected abnormal user whose scoring result reaches the preset score is determined as an abnormal user. The invention builds the detection system with the user as the core object and can analyze multiple items of behavior feature information. The abnormal behavior information determined from the comparison between the behavior feature information and the preset feature baselines effectively reflects user anomalies, and abnormal users can be accurately located based on the preset ensemble learning model, so that insider threats are discovered and stopped in time and information leakage is prevented.
Brief description of the drawings
To explain the specific embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed for describing the specific embodiments or the prior art are briefly introduced below. The drawings described below show some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of an abnormal user detection method based on ensemble learning provided by an embodiment of the present invention;
Fig. 2 is the 24-hour online probability density distribution of the employee accounts of a certain company;
Fig. 3 is a cluster analysis plot of the engineering department and the sales department in terms of access count and access duration;
Fig. 4 is the traffic time series of a certain user in June;
Fig. 5 is a flowchart of step S104 in Fig. 1;
Fig. 6 is a structural schematic diagram of an abnormal user detection system based on ensemble learning provided by an embodiment of the present invention;
Fig. 7 is a structural schematic diagram of the scoring module;
Fig. 8 is a structural schematic diagram of another abnormal user detection system based on ensemble learning provided by an embodiment of the present invention.
Reference numerals:
11 - first acquisition module; 12 - comparison module; 13 - extraction module; 14 - scoring module; 15 - second acquisition module; 16 - association module; 21 - determination unit; 22 - scoring unit; 23 - fusion unit.
Detailed description of the embodiments
The technical solutions of the present invention are described clearly and completely below in conjunction with the embodiments. The described embodiments are some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
At present, leakage incidents in which internal staff take part in selling information or sharing it with third parties in violation of the rules occur frequently, and for enterprises that have recognized the urgency of the problem, traditional security techniques fail to help them effectively solve security problems that originate from inside. On this basis, the embodiments of the present invention provide an abnormal user detection method and system based on ensemble learning. The detection system is built with the user as the core object and can analyze multiple items of behavior feature information. The abnormal behavior information determined from the comparison between the behavior feature information and the preset feature baselines effectively reflects user anomalies, and abnormal users can be accurately located based on the preset ensemble learning model, so that insider threats are discovered and stopped in time and information leakage is prevented.
At present, more and more attacks are characterized by long duration, low frequency, and strong concealment. Such inconspicuous attacks can bypass traditional security detection methods and damage large amounts of data. User and Entity Behavior Analytics (UEBA) systems, as an emerging abnormal user detection architecture, are gradually overturning traditional defensive measures and shifting network security protection from "passive defense" to "active attack".
To facilitate understanding of the present embodiment, the abnormal user detection method based on ensemble learning disclosed in the embodiments of the present invention is first described in detail.
Embodiment one:
Referring to Fig. 1, an embodiment of the present invention provides an abnormal user detection method based on ensemble learning, which may include the following steps:
Step S101: acquire the behavior information to be detected of a user, where the behavior information to be detected includes at least one item of behavior feature information.
In this embodiment, the behavior information to be detected is the behavior information on which detection is to be performed. By category, behavior information can be divided into network behavior information and terminal behavior information. Behavior information is provided by user behavior data sources, which include but are not limited to security logs, network traffic, threat intelligence, identity and access related logs, and logs related to user scenarios. The logs related to user scenarios include but are not limited to VPN logs, OA logs, employee card consumption logs, and access-control face-recognition logs.
Step S102: compare the behavior feature information with the preset feature baseline corresponding to the behavior feature information to obtain a comparison result.
In this embodiment, each item of behavior feature information has its own corresponding preset feature baseline. The preset feature baseline may be determined by the user group to which the user belongs, or by the historical information of the user. When the preset feature baseline is determined by user group, different user groups can be given different preset feature baselines. When the preset feature baseline is set for an individual user, it can be determined from that user's historical information. The comparison result is the outcome of comparing the behavior feature information with the preset feature baseline and falls into two types: the behavior feature information matches the preset feature baseline, or it does not. Because the behavior information to be detected of one user may include multiple items of behavior feature information, one user may have multiple comparison results.
Step S103: extract abnormal behavior information from the behavior information to be detected according to the comparison result, and determine each user who has abnormal behavior information as a suspected abnormal user.
In this embodiment, abnormal behavior information is behavior feature information whose comparison result is a mismatch. Each item of behavior feature information has one comparison result, so all of a user's behavior feature information together yields multiple comparison results. The behavior feature information whose comparison result is a mismatch is extracted as abnormal behavior information, and the user who has that abnormal behavior information is determined as a suspected abnormal user. A suspected abnormal user may have one or several items of abnormal behavior information.
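Steps S102 and S103 can be sketched as follows. This is a minimal illustration, not the patented implementation: the feature names (`login_hour`, `daily_mb`), the baseline predicates, and the sample users are all hypothetical, and each baseline is simplified to a function that returns True on a match.

```python
from typing import Callable, Dict

# Hypothetical representation: a baseline is a predicate that returns True
# when the observed feature value matches the preset feature baseline.
Baseline = Callable[[float], bool]
Features = Dict[str, float]


def compare_with_baselines(features: Features,
                           baselines: Dict[str, Baseline]) -> Dict[str, bool]:
    """Step S102: one comparison result (True = match) per behavior feature."""
    return {name: baselines[name](value) for name, value in features.items()}


def extract_abnormal(features: Features, results: Dict[str, bool]) -> Features:
    """Step S103: keep only the features whose comparison result is a mismatch."""
    return {name: features[name] for name, ok in results.items() if not ok}


def find_suspected_users(users: Dict[str, Features],
                         baselines: Dict[str, Baseline]) -> Dict[str, Features]:
    """Return each suspected abnormal user with their abnormal behavior info."""
    suspected = {}
    for user, feats in users.items():
        abnormal = extract_abnormal(feats, compare_with_baselines(feats, baselines))
        if abnormal:  # at least one mismatch makes the user a suspect
            suspected[user] = abnormal
    return suspected


# Illustrative baselines and data (assumed values, not from the patent).
baselines = {
    "login_hour": lambda h: 7 <= h <= 23,
    "daily_mb": lambda mb: mb <= 500,
}
users = {
    "alice": {"login_hour": 9, "daily_mb": 120},
    "bob": {"login_hour": 4, "daily_mb": 900},
    "carol": {"login_hour": 10, "daily_mb": 650},
}
suspected = find_suspected_users(users, baselines)
```

Here a user with a single mismatching feature is already a suspect; how serious the anomaly is gets decided later, in the scoring step.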
Step S104: score the suspected abnormal users with the preset ensemble learning model, and determine each suspected abnormal user whose scoring result reaches the preset score as an abnormal user.
In this embodiment, the preset ensemble learning model is an ensemble learning model trained in advance. Each item of behavior feature information of a suspected abnormal user is fed to the preset ensemble learning model as input, and the scoring result is obtained as output. The scoring result is the score of the suspected abnormal user and determines how severe the user's behavioral anomaly is: the higher the score, the more severe the anomaly. Because different application scenarios and different detection requirements affect the choice of the preset score, the preset score can be customized.
In the abnormal user detection method based on ensemble learning provided by this embodiment, the behavior information to be detected of a user is first acquired, where the behavior information to be detected includes at least one item of behavior feature information; the behavior feature information is then compared with the corresponding preset feature baseline to obtain a comparison result; abnormal behavior information is extracted from the behavior information to be detected according to the comparison result, and each user who has abnormal behavior information is determined as a suspected abnormal user; finally, the suspected abnormal users are scored with the preset ensemble learning model, and each suspected abnormal user whose scoring result reaches the preset score is determined as an abnormal user. The embodiment builds the detection system with the user as the core object and can analyze multiple items of behavior feature information. The abnormal behavior information determined from the comparison between the behavior feature information and the preset feature baselines effectively reflects user anomalies, and abnormal users can be accurately located based on the preset ensemble learning model, so that insider threats are discovered and stopped in time and information leakage is prevented.
In this embodiment, step S102 (comparing the behavior feature information with the corresponding preset feature baseline to obtain a comparison result) and step S103 (extracting abnormal behavior information from the behavior information to be detected according to the comparison result and determining each user who has abnormal behavior information as a suspected abnormal user) may cover the following situations:
Situation 1:
Based on the principle that most users in the same user group behave similarly, the same item of behavior feature information is analyzed across the different users of one user group to obtain the preset feature baseline. With this baseline, the few users who deviate from it can be identified. One item of behavior feature information corresponds to one feature dimension, and in a given feature dimension those few users are the suspected abnormal users. For example, the behavior feature information may be the user's working time and the preset feature baseline may be the preset working time. The preset working time is determined as follows: the historical working times of all employees of the same department are recorded, and kernel density estimation (KDE) is applied to the records to calculate the users' online probability at every time point within one day (with the day as the period). Times whose online probability is greater than or equal to a preset online threshold are taken as preset working time, and times whose online probability is lower than the preset online threshold are taken as preset non-working time.
Specifically, Fig. 2 shows the 24-hour online probability density distribution of the employee accounts of a certain company, where the abscissa is the time point and the ordinate is the online probability of user accounts at each time point. The distribution shows that the online probability of employee accounts peaks at 16:00. If the preset online threshold is 0.01, the online probability of the company's employees from 3:00 to 6:00 in the morning is below the threshold. From 0:00 to 3:00, some employees are still working overtime over VPN, which shows that overtime at this company is heavy and that working until 1:00 or 2:00 is the norm. If the period from 22:00 to 6:00 were simply defined as non-working time, many false positives would result. Because this embodiment is based on kernel density estimation, it can learn the company's preset working time adaptively.
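The working-time baseline of Situation 1 can be sketched with a hand-written Gaussian kernel density estimate. This is a simplified illustration under stated assumptions: the login history, the bandwidth of 1.0 hour, and the function names are hypothetical, and the estimate treats the hour axis as linear (it does not wrap 23:00 around to 0:00). Only the 0.01 threshold comes from the text.

```python
import math
from typing import Callable, List


def gaussian_kde(samples: List[float], bandwidth: float) -> Callable[[float], float]:
    """Return a density function estimated from samples with a Gaussian kernel."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x: float) -> float:
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2)
                          for s in samples)

    return density


def preset_working_hours(login_hours: List[float],
                         threshold: float = 0.01,
                         bandwidth: float = 1.0) -> List[int]:
    """Hours of the day whose estimated online density reaches the threshold."""
    density = gaussian_kde(login_hours, bandwidth)
    return [h for h in range(24) if density(h) >= threshold]


# Hypothetical department history: hour of day of each recorded online event.
# Daytime work dominates, with some overtime around midnight.
history = [9, 10, 11, 14, 15, 16, 16, 16, 17, 18, 20, 22, 23, 0, 1, 1, 2]

working = preset_working_hours(history)
non_working = [h for h in range(24) if h not in working]
```

With this assumed history, the overtime hours shortly after midnight stay inside the learned working time while the early-morning hours fall below the threshold, which is the adaptive behavior the text describes. A production system would more likely use a library estimator and a circular kernel.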
Situation 2:
For the same item of behavior feature information, different user groups have different preset feature baselines. A user group may be a department within the enterprise. Because departments differ considerably on the same item of behavior feature information, different departments have different preset feature baselines. For example, the engineering department and the sales department differ considerably in how they work, that is, in their network behavior and terminal behavior.
Because employees of different departments have different role attributes, their URL (Uniform Resource Locator) access records differ. The access records show that employees with the same role attribute, or in the same department, share common access targets and access purposes. From the log information, an association matrix can be built between employees and the URLs that were accessed most often in a given period or that are business-related; a matrix element may be the access count, the access duration, or the average access duration. The distance between employees can be calculated as the Euclidean distance, and a clustering operation can then be performed on these distances. The clustering operation can label employees who lie far from the department of their own role as suspected abnormal users. Specifically, the behavior feature information may be the employee's deviation degree and the preset feature baseline may be a preset deviation degree: the employee's deviation degree is compared with the preset deviation degree, and employees whose deviation degree exceeds the preset value are labeled as suspected abnormal users. The clustering operation may produce several groups at once, each group being a cluster with its own cluster center. The deviation degree of each employee is calculated from the employee's distance to the cluster center, using the following quantities: D_i is the deviation degree of the i-th employee, d_i is the distance from the i-th employee to the cluster center, and d_mean is the average distance from employees to the cluster center.
Fig. 3 is the cluster analysis plot of the engineering department and the sales department in terms of access count and access duration. In Fig. 3, circles represent the engineering department, triangles represent the sales department, and five-pointed stars represent the cluster centers of the two groups. The few users scattered between the two clusters are determined as suspected abnormal users. If clustering were performed directly without distinguishing user groups, the triangles mixed in among the circles would be regarded as normal users. Because this embodiment clusters each user group separately, it finds that the triangles mixed in among the circles lie far from the cluster center of the sales department, and they can be determined as suspected abnormal users.
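The per-department deviation check of Situation 2 can be sketched as below. Two points are assumptions rather than content of the text: the deviation formula is not preserved here, so the sketch uses the ratio D_i = d_i / d_mean as one plausible form built from the defined quantities; and instead of running a clustering algorithm it takes the department label as given and uses the department mean as the cluster center. The feature vector (access count, access duration), the sample data, and the preset deviation degree of 2.0 are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Vector = Tuple[float, float]  # hypothetical features: (access count, access duration)


def centroid(points: List[Vector]) -> Vector:
    """Mean point of a group, used here as the cluster center."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def deviation_degrees(group: Dict[str, Vector]) -> Dict[str, float]:
    """Deviation degree of each employee relative to the group's cluster center."""
    center = centroid(list(group.values()))
    # d_i: Euclidean distance from employee i to the cluster center
    dist = {name: math.dist(vec, center) for name, vec in group.items()}
    d_mean = sum(dist.values()) / len(dist)
    if d_mean == 0:  # all employees coincide with the center
        return {name: 0.0 for name in dist}
    # ASSUMPTION: D_i = d_i / d_mean (the original formula is not preserved)
    return {name: d / d_mean for name, d in dist.items()}


def suspected_in_group(group: Dict[str, Vector], preset_deviation: float) -> List[str]:
    """Employees whose deviation degree exceeds the preset deviation degree."""
    degrees = deviation_degrees(group)
    return sorted(name for name, d in degrees.items() if d > preset_deviation)


# Hypothetical sales department: s5 behaves like a different role.
sales = {
    "s1": (10.0, 5.0),
    "s2": (12.0, 6.0),
    "s3": (11.0, 4.0),
    "s4": (9.0, 5.0),
    "s5": (40.0, 30.0),
}
flagged = suspected_in_group(sales, preset_deviation=2.0)
```

Evaluating each department against its own center is what lets an employee who looks ordinary in the company-wide picture stand out within their own group.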
Situation 3:
In this embodiment, the behavior feature information is data in discrete form, for example the address information of a user; the preset feature baseline may be a historical baseline, and the abnormal behavior information is abnormal address information. The historical baseline is determined by learning from a large amount of historical information, and abnormal address information can be extracted for users who deviate from it. For example, a user uses a new IP address that has never appeared in the historical information. The new IP address means that the user's address information has deviated from its original track, although such a deviation may have an objective cause, such as a business trip. The user can therefore be assessed in combination with other information. For example, if the new IP address is accompanied by a new MAC address, the user has changed not only the login address but also the login device, which increases the degree of suspicion. If the user's new IP address keeps appearing, this behavior feature information is determined as abnormal behavior information. By envisaging specific scenarios in this way and relying on the user's own historical baseline, abnormal behavior information in discrete data form can be extracted.
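The address check of Situation 3 can be sketched with plain sets. The text does not say how many consecutive appearances of a new IP address count as "keeps appearing", so `min_run=3` is an assumed parameter; the function name, the returned fields, and the sample addresses are likewise hypothetical.

```python
from typing import Dict, List, Set, Tuple


def assess_logins(known_ips: Set[str],
                  known_macs: Set[str],
                  logins: List[Tuple[str, str]],
                  min_run: int = 3) -> Dict[str, object]:
    """Compare recent (ip, mac) logins against the user's historical baseline."""
    run = longest = 0
    new_device = False
    for ip, mac in logins:
        if ip not in known_ips:
            run += 1                      # the new IP address keeps appearing
            longest = max(longest, run)
            if mac not in known_macs:
                new_device = True         # new IP together with a new MAC
        else:
            run = 0                       # back on the historical track
    return {
        "new_ip_run": longest,
        "new_device": new_device,         # raises suspicion, per the text
        "abnormal": longest >= min_run,   # ASSUMPTION: threshold on the run length
    }


history_ips = {"10.0.0.5"}
history_macs = {"aa:bb:cc:00:00:01"}

# A single login from a new address on the usual device (e.g. a business trip).
trip = assess_logins(history_ips, history_macs,
                     [("10.0.0.5", "aa:bb:cc:00:00:01"),
                      ("172.16.8.9", "aa:bb:cc:00:00:01")])

# Repeated logins from a new address on a new device.
takeover = assess_logins(history_ips, history_macs,
                         [("10.0.0.5", "aa:bb:cc:00:00:01"),
                          ("172.16.8.9", "de:ad:be:ef:00:02"),
                          ("172.16.8.9", "de:ad:be:ef:00:02"),
                          ("172.16.8.9", "de:ad:be:ef:00:02")])
```

Keeping `new_device` separate from `abnormal` mirrors the text: a new MAC address increases suspicion, while persistence of the new IP address is what turns the feature into abnormal behavior information.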
Situation 4:
In this embodiment, the behavior feature information is data in continuous form, for example the traffic information of a user; the preset feature baseline may be a preset traffic baseline, and the abnormal behavior information is abnormal traffic information. Based on the preset traffic baseline, abnormal traffic information can be extracted for users who deviate from it. Specifically, a user's normal network behavior produces inbound and outbound traffic that fluctuates within a certain range, and a DPI system can record the traffic for each target the user accesses. The user's inbound and outbound traffic is a continuous variable and should follow some distribution. If the user's access traffic stays far from its historical distribution for a sustained period, there is reason to suspect that the user's usage habits have changed, and this needs close attention. Algorithms such as RPCA-SST and ARIMA can perform anomaly detection on this kind of continuous time series data in order to extract abnormal traffic information.
Fig. 4 is the traffic time series of a certain user in June. In Fig. 4, the dashed lines are the normal range fitted by the time series anomaly detection algorithm, and the solid line is the user's actual traffic. Where the actual traffic falls outside the two dashed lines, the traffic that exceeds the normal range is marked as an anomaly point (the circles in Fig. 4). The abnormal traffic information of the user can be extracted according to the number of anomaly points and their degree of abnormality.
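The idea of a fitted normal range with anomaly points outside it can be illustrated with a much simpler stand-in than the algorithms named above. The sketch below is not RPCA-SST or ARIMA: it swaps in a trailing-window band of mean plus or minus k standard deviations, purely to show the flagging logic. The window size, the multiplier k, and the daily traffic values are assumed.

```python
import statistics
from typing import List, Tuple


def flag_traffic_anomalies(series: List[float],
                           window: int = 7,
                           k: float = 3.0) -> List[Tuple[int, float]]:
    """Flag points that fall outside mean +/- k*std of the preceding window."""
    anomalies = []
    for t in range(window, len(series)):
        past = series[t - window:t]
        mean = statistics.fmean(past)
        std = statistics.pstdev(past)
        lower, upper = mean - k * std, mean + k * std  # the "dashed lines"
        if not (lower <= series[t] <= upper):
            anomalies.append((t, series[t]))           # an anomaly point
    return anomalies


# Hypothetical daily traffic in MB, with one burst on day index 9.
daily_mb = [100, 104, 98, 101, 97, 103, 99, 102, 100, 480, 101, 99]
anomaly_points = flag_traffic_anomalies(daily_mb)
```

One limitation of this stand-in is visible in the data: once the burst enters the trailing window it inflates the band, so later moderate deviations could be missed. Methods designed for time series handle trend, seasonality, and contaminated history far better.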
Further, the preset ensemble learning model is a fusion of at least two detection algorithms, where the detection algorithms include: the isolation forest algorithm, One-Class SVM, and the local outlier factor algorithm.
In this embodiment, the anomalous events corresponding to abnormal behavior information occur far less frequently than the large number of normal events. The purpose of anomaly detection is to extract the low-probability abnormal data from the user data set. Such abnormal data is not produced by random deviation but by entirely different mechanisms, such as failures, threats, or intrusions. There are many anomaly detection algorithms. Although they all aim to separate normal data from abnormal data as far as possible, their principles differ, and for different data sources it is hard to guarantee which algorithm will give the best result. This embodiment therefore integrates three algorithms (isolation forest, One-Class SVM, and local outlier factor) to identify and evaluate the various kinds of abnormal users comprehensively.
Referring to Fig. 5, step S104 may include the following steps:
Step S201: determine at least two detection algorithms and the weight of each detection algorithm.
This embodiment uses three algorithms, isolation forest, One-Class SVM, and local outlier factor, whose weights are P1, P2, and P3 respectively.
Step S202: score the suspected abnormal user with each of the at least two detection algorithms to obtain at least two anomaly scores.
This embodiment applies the three algorithms, isolation forest, One-Class SVM, and local outlier factor, separately to suspected abnormal user i to compute independent anomaly scores, denoted S1, S2, and S3.
Step S203: fuse the at least two anomaly scores based on the weights to obtain the scoring result.
In this embodiment of the present invention, the scoring result of suspected abnormal user i is Score:
where Score is the scoring result, Si is the anomaly score given by the i-th detection algorithm, and Pi is the weight of the i-th detection algorithm.
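The formula itself is not reproduced in this text. Assuming the fusion is a plain weighted sum, which is consistent with the definitions of Si and Pi given here but is not confirmed by the source, it could be written as:

```latex
\mathrm{Score} = \sum_{i=1}^{3} P_i \, S_i = P_1 S_1 + P_2 S_2 + P_3 S_3
```

If the weights are chosen so that P1 + P2 + P3 = 1 (an additional assumption), Score stays on the same scale as the individual anomaly scores.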
Scoring with the fusion method yields the scoring result of each suspected abnormal user. There are two ways to determine abnormal users. Mode one: a suspected abnormal user whose scoring result reaches a preset score is determined to be an abnormal user. Mode two: the scoring results of the suspected abnormal users are ranked from high to low, and the top n users are determined to be abnormal users. The value of n can be set empirically; typically n is 10. Table 1 gives the scoring results of the users.
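Steps S201 to S203 and the two determination modes can be sketched as follows. This is a minimal illustration under stated assumptions: the fusion rule is taken to be a weighted sum, the per-detector scores are supplied as plain numbers rather than computed by isolation forest, One-Class SVM, or local outlier factor, and all function names, weights, and user identifiers are hypothetical.

```python
def fuse_scores(scores, weights):
    """Weighted sum of per-detector anomaly scores (assumed fusion rule)."""
    if len(scores) != len(weights):
        raise ValueError("need exactly one weight per detector score")
    return sum(p * s for p, s in zip(weights, scores))


def select_by_threshold(results, preset_score):
    """Mode one: users whose fused score reaches the preset score."""
    return [user for user, score in results.items() if score >= preset_score]


def select_top_n(results, n=10):
    """Mode two: the n users with the highest fused scores."""
    ranked = sorted(results.items(), key=lambda item: item[1], reverse=True)
    return [user for user, _ in ranked[:n]]


# Hypothetical weights P1..P3 and per-detector scores S1..S3 per user.
weights = (0.5, 0.3, 0.2)
detector_scores = {
    "u1": (0.9, 0.8, 0.7),
    "u2": (0.2, 0.1, 0.3),
    "u3": (0.6, 0.9, 0.4),
}
results = {user: fuse_scores(s, weights) for user, s in detector_scores.items()}
```

Mode one suits a setting where an absolute score has a stable meaning; mode two bounds the number of users an analyst has to review, whatever the score distribution looks like.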
Table 1. Scoring results of users
For mode two, as Table 1 shows, user_hash identifies the suspected abnormal user and Score is the scoring result; the scoring results in Table 1 are sorted from high to low. X1, X2, X3, X4, X5, and X6 are the feature values of the suspected abnormal user's individual pieces of behavior feature information. A feature value of 0 means that the corresponding behavior feature information is not abnormal behavior information; a non-zero feature value means that it is abnormal behavior information. For the same behavior feature, the magnitude of the feature value indicates how severe the anomaly is.
Further, the method also includes: collecting the user's identity information and entity information; and, based on the behavior information to be detected, associating the identity information with the entity information to restore the user's network session.
In this embodiment of the present invention, the identity information includes real identity information and virtual identity information. Sources of real identity information include, but are not limited to, employee records provided by the personnel department; sources of virtual identity information include, but are not limited to, the user's registration data on the network. A unified data dictionary makes it possible to unify the field information of different logs and thus associate the user information in different logs; associating real identity information with virtual identity information makes it possible to locate a specific user.
The user's entity information may be the user's unique identifier in the network and includes, but is not limited to, the IP address and MAC address. The user's entity information can be associated with the identity information through the user's behavior information to be detected, that is, employee accounts are associated with entity assets. Through analysis, completion, and integration, the user behavior information completes the association between user and entity, and also fully restores the user's network sessions and the user's behavior during each session.
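The association of identity, entity, and behavior can be sketched as a dictionary-driven join. The field names, the `FIELD_DICTIONARY` mapping, and the log layouts below are illustrative assumptions, not taken from the patent; the sketch also ignores time, whereas a real system would need to bound each IP-to-account assignment by a time interval.

```python
# Hypothetical unified data dictionary: log-specific field -> common field.
FIELD_DICTIONARY = {
    "src_ip": "ip",
    "client_ip": "ip",
    "login": "account",
    "user_name": "account",
    "mac_addr": "mac",
}


def normalize(record):
    """Rename a log record's fields according to the data dictionary."""
    return {FIELD_DICTIONARY.get(key, key): value for key, value in record.items()}


def associate(employees, auth_log, traffic_log):
    """Link accounts (identity) to IP/MAC (entity) and attach behavior events.

    employees:   account -> real identity (for example, from personnel records)
    auth_log:    records that tie an account to an IP and optionally a MAC
    traffic_log: records that tie an IP to an observed event
    """
    ip_owner = {}
    for rec in map(normalize, auth_log):
        ip_owner[rec["ip"]] = (rec["account"], rec.get("mac"))

    sessions = {}
    for rec in map(normalize, traffic_log):
        if rec["ip"] not in ip_owner:
            continue  # behavior that cannot be attributed to a known account
        account, mac = ip_owner[rec["ip"]]
        session = sessions.setdefault(
            account,
            {"employee": employees.get(account), "entities": set(), "events": []},
        )
        session["entities"].add(rec["ip"])
        if mac:
            session["entities"].add(mac)
        session["events"].append(rec["event"])
    return sessions
```

Each value in the returned mapping groups one account's real identity, the entities it used, and the events observed on them, which is the raw material for restoring a network session.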
This embodiment of the present invention associates the three elements of user, entity, and behavior and sorts out the various kinds of behavior feature information. Feature-extraction dimensions for four predefined situations then make it possible to extract dozens of basic features that best reflect user abnormality. Ensemble learning then applies three anomaly detection algorithms to abnormal-user modeling. Finally, anomaly scoring locates the batch of users with the highest anomaly risk. In addition, this embodiment does not rely on labels for user accounts at the outset; over a period of use and review, labels for user accounts gradually accumulate, so that the algorithms of the whole system move from unsupervised to supervised. The preset ensemble learning model is reinforced by a continuous positive feedback loop, eventually building a solid security defense line.
Embodiment two:
Referring to Fig. 6, an embodiment of the present invention provides an abnormal user detection system based on ensemble learning, which may include the following modules:
a first acquisition module 11, configured to collect the user's behavior information to be detected, where the behavior information to be detected includes at least one piece of behavior feature information;
a comparison module 12, configured to compare the behavior feature information with the preset feature baseline corresponding to that behavior feature information to obtain a comparison result;
an extraction module 13, configured to extract abnormal behavior information from the behavior information to be detected according to the comparison result, and to determine users for whom abnormal behavior information exists as suspected abnormal users;
a scoring module 14, configured to score the suspected abnormal users using the preset ensemble learning model, and to determine suspected abnormal users whose scoring result reaches a preset score as abnormal users.
An embodiment of the present invention provides an abnormal user detection system based on ensemble learning. The first acquisition module first collects the user's behavior information to be detected, which includes at least one piece of behavior feature information. The comparison module then compares the behavior feature information with the corresponding preset feature baseline to obtain a comparison result. The extraction module next extracts abnormal behavior information from the behavior information to be detected according to the comparison result and determines users for whom abnormal behavior information exists as suspected abnormal users. Finally, the scoring module scores the suspected abnormal users using the preset ensemble learning model and determines suspected abnormal users whose scoring result reaches the preset score as abnormal users. This embodiment builds the detection system around the user as the core object and can analyze multiple pieces of behavior feature information. The abnormal behavior information determined from comparing the behavior feature information with the preset feature baselines effectively reflects user abnormality, and the preset ensemble learning model makes it possible to locate abnormal users accurately, discover internal threats in time, stop them in time, and prevent information leakage.
Further, the preset ensemble learning model is a fusion of at least two detection algorithms. Referring to Fig. 7, the scoring module 14 includes:
a determination unit 21, configured to determine at least two detection algorithms and the weight of each detection algorithm;
a scoring unit 22, configured to score the suspected abnormal user with each of the at least two detection algorithms to obtain at least two anomaly scores;
a fusion unit 23, configured to fuse the at least two anomaly scores based on the weights to obtain the scoring result.
Further, referring to Fig. 8, the system also includes:
a second acquisition module 15, configured to collect the user's identity information and entity information;
an association module 16, configured to associate the identity information with the entity information based on the behavior information to be detected and to restore the user's network session.
Another embodiment of the present invention provides an electronic device including a memory and a processor, where the memory stores a computer program executable on the processor, and the processor implements the steps of the method described in the above method embodiment when executing the computer program.
Another embodiment of the present invention provides a computer-readable medium having non-volatile program code executable by a processor, where the program code causes the processor to execute the method described in the method embodiment.
In the description of the present invention, it should be noted that the terms "first" and "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance.
Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention and not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments can still be modified, or some or all of their technical features can be equivalently replaced; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. An abnormal user detection method based on ensemble learning, characterized by comprising:
collecting behavior information to be detected of a user, wherein the behavior information to be detected includes at least one piece of behavior feature information;
comparing the behavior feature information with a preset feature baseline corresponding to the behavior feature information to obtain a comparison result;
extracting abnormal behavior information from the behavior information to be detected according to the comparison result, and determining a user for whom the abnormal behavior information exists as a suspected abnormal user;
scoring the suspected abnormal user using a preset ensemble learning model, and determining a suspected abnormal user whose scoring result reaches a preset score as an abnormal user.
2. The method according to claim 1, characterized in that the preset feature baseline is determined by the user group to which the user belongs;
or,
the preset feature baseline is determined by historical information of the user.
3. The method according to claim 1, characterized in that the preset ensemble learning model is a fusion of at least two detection algorithms;
the scoring of the suspected abnormal user using the preset ensemble learning model comprises:
determining the at least two detection algorithms and the weight of each detection algorithm;
scoring the suspected abnormal user with each of the at least two detection algorithms to obtain at least two anomaly scores;
fusing the at least two anomaly scores based on the weights to obtain a scoring result.
4. The method according to claim 3, characterized in that the detection algorithms include: the isolation forest algorithm, One-Class SVM, and the local outlier factor algorithm.
5. The method according to claim 1, characterized by further comprising:
collecting identity information and entity information of the user;
associating the identity information with the entity information based on the behavior information to be detected, and restoring the network session of the user.
6. An abnormal user detection system based on ensemble learning, characterized by comprising:
a first acquisition module, configured to collect behavior information to be detected of a user, wherein the behavior information to be detected includes at least one piece of behavior feature information;
a comparison module, configured to compare the behavior feature information with a preset feature baseline corresponding to the behavior feature information to obtain a comparison result;
an extraction module, configured to extract abnormal behavior information from the behavior information to be detected according to the comparison result, and to determine a user for whom the abnormal behavior information exists as a suspected abnormal user;
a scoring module, configured to score the suspected abnormal user using a preset ensemble learning model, and to determine a suspected abnormal user whose scoring result reaches a preset score as an abnormal user.
7. The system according to claim 6, characterized in that the preset ensemble learning model is a fusion of at least two detection algorithms;
the scoring module comprises:
a determination unit, configured to determine the at least two detection algorithms and the weight of each detection algorithm;
a scoring unit, configured to score the suspected abnormal user with each of the at least two detection algorithms to obtain at least two anomaly scores;
a fusion unit, configured to fuse the at least two anomaly scores based on the weights to obtain a scoring result.
8. The system according to claim 6, characterized by further comprising:
a second acquisition module, configured to collect identity information and entity information of the user;
an association module, configured to associate the identity information with the entity information based on the behavior information to be detected, and to restore the network session of the user.
9. An electronic device, comprising a memory and a processor, the memory storing a computer program executable on the processor, characterized in that the processor, when executing the computer program, implements the method according to any one of claims 1 to 5.
10. A computer-readable medium having non-volatile program code executable by a processor, characterized in that the program code causes the processor to execute the method according to any one of claims 1 to 5.
CN201910751220.0A 2019-08-14 2019-08-14 Abnormal user detection method and system based on ensemble learning Active CN110351307B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910751220.0A CN110351307B (en) 2019-08-14 2019-08-14 Abnormal user detection method and system based on ensemble learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910751220.0A CN110351307B (en) 2019-08-14 2019-08-14 Abnormal user detection method and system based on ensemble learning

Publications (2)

Publication Number Publication Date
CN110351307A true CN110351307A (en) 2019-10-18
CN110351307B CN110351307B (en) 2022-01-28

Family

ID=68185095

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910751220.0A Active CN110351307B (en) 2019-08-14 2019-08-14 Abnormal user detection method and system based on ensemble learning

Country Status (1)

Country Link
CN (1) CN110351307B (en)

Also Published As

Publication number Publication date
CN110351307B (en) 2022-01-28

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant