CN110502883A - PCA-based keystroke behavior anomaly detection method
- Publication number: CN110502883A
- Application number: CN201910785323.9A
- Authority: CN (China)
- Prior art keywords: data, keystroke, abnormal, pca, score
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3438—Monitoring of user actions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3466—Performance evaluation by tracing or monitoring
- G06F11/3476—Data logging
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/30—Authentication, i.e. establishing the identity or authorisation of security principals
- G06F21/31—User authentication
- G06F21/316—User authentication by observing the pattern of computer usage, e.g. typical user behaviour
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Computer Hardware Design (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Quality & Reliability (AREA)
- Computer Security & Cryptography (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Social Psychology (AREA)
- Software Systems (AREA)
- Input From Keyboards Or The Like (AREA)
- Collating Specific Patterns (AREA)
Abstract
The invention discloses a PCA-based method for detecting abnormal keystroke behavior, comprising the following steps: A. collect the user's keystroke data, including keystroke duration and keystroke interval time; B. preprocess the keystroke data by handling missing or malformed data and then centering and normalizing the data; C. build a PCA anomaly detection model and obtain a composite anomaly score; D. set an anomaly score threshold and judge the keystroke behavior abnormal if the anomaly score of the sample under test exceeds the threshold. The method builds its model from a normal user's keystroke data, detects abnormal behavior without a large amount of training data, and benefits from the low computational cost and simple model of the PCA algorithm.
Description
Technical field
The present invention relates to the technical fields of machine learning and network security, and in particular to a PCA-based method for detecting abnormal keystroke behavior.
Background
Biometric authentication verifies identity using physiological characteristics unique to each person, such as fingerprints, palm prints, the face, or the iris, or behavioral characteristics such as handwriting and voice. Because these characteristics are largely unique and hard to imitate, they greatly reduce the risk of a user being impersonated. As biometric technology has matured it has been applied successfully in many fields, but its adoption is constrained by the need for additional, costly feature-extraction equipment. Keystroke characteristics have clear advantages over other biometric methods. First, they use the keyboard as the feature-extraction device, so the only requirement is that identification software be embedded in the computer system, which keeps the cost low. Second, they integrate the verification process into the user's login process and have no impact on how the user works.
In earlier research, keystroke pattern recognition mainly used methods such as support vector machines (SVM) and BP neural networks. An SVM builds a binary classification model from a normal user's keystroke data and abnormal data. Because every keystroke sample other than the user's own counts as abnormal, abnormal behavior data is extremely varied and cannot be collected comprehensively, which degrades the model; the SVM model itself also has relatively high computational complexity. Moreover, keystroke anomaly detection builds one model per user, so with a large user base the SVM models occupy too much storage. BP neural network models likewise have a complex structure, and their training requires a large amount of data, so model accuracy suffers when a user's data is insufficient.
Summary of the invention
The purpose of the present invention is to overcome the shortcomings of the background art described above by providing a PCA-based method for detecting abnormal keystroke behavior. The method builds a model from a normal user's keystroke data and detects abnormal behavior without a large amount of training data, and the PCA algorithm has a low computational cost and yields a simple model.
To achieve this technical effect, the present invention adopts the following technical solution:
A PCA-based keystroke behavior anomaly detection method, comprising the following steps:
A. Collect the user's keystroke data, including keystroke duration and keystroke interval time.
B. Preprocess the keystroke data: handle missing or malformed data, then center and normalize the keystroke data. Centering makes the subsequent formulas more concise and does not affect the eigenvalue decomposition. Normalization brings the variance of the different variables onto the same scale, removing the influence of differing units so that the variables are comparable.
C. Build the PCA anomaly detection model and obtain a composite anomaly score. The eigenvectors produced by the eigenvalue decomposition in PCA reflect the directions along which the variance of the original data changes, and each eigenvalue is the variance of the data along the corresponding direction. The eigenvector of the largest eigenvalue is therefore the direction of greatest data variance, and the eigenvector of the smallest eigenvalue is the direction of least data variance. If an individual sample is inconsistent with the characteristics shown by the data as a whole, for example by deviating strongly from the other samples along some direction, that sample may be an anomaly.
D. Set an anomaly score threshold; if the anomaly score of the sample under test exceeds the threshold, the keystroke behavior is judged abnormal.
Further, in step A, the keystroke duration is the difference between the time a key is released and the time it is pressed: $H_i = R_i - P_i$. The keystroke interval time is $I_i = P_{i+1} - P_i$, where $P_i$ is the moment the i-th key is pressed and $R_i$ is the moment the i-th key is released. In this method the keystroke interval is taken as the time difference between the press of the later key and the press of the earlier key, which avoids the case where the second key is pressed before the previous key has been released.
Further, step B applies the following transformation to the keystroke data: $x' = \frac{x - \bar{x}}{\sigma}$, where $x'$ is the processed data, $x$ is the original data, $\bar{x}$ is the mean of the original data, and $\sigma$ is the standard deviation of the original data. Preprocessing the original keystroke data in this way yields data with mean 0 and standard deviation 1, which the method treats as following the standard normal distribution and uses as the training data for the model in the next step.
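A minimal sketch of this standardization with NumPy is shown below. It assumes that each column of the data matrix is one keystroke feature and that the transformation is applied per column with the population standard deviation; the text does not state either detail. The guard for constant columns and the returned `mean` and `std` (kept so that a new sample could be scaled the same way) are likewise additions for the example.

```python
import numpy as np


def standardize(X):
    """Center each feature column and scale it to unit standard deviation."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0)          # population standard deviation (assumption)
    std[std == 0] = 1.0          # avoid division by zero for constant columns
    return (X - mean) / std, mean, std


# Hypothetical training matrix: rows are typing samples, columns are features.
X_raw = np.array([[90.0, 150.0],
                  [110.0, 130.0],
                  [100.0, 170.0],
                  [95.0, 160.0]])
X_std, mean, std = standardize(X_raw)
```

After the call, every column of `X_std` has mean 0 and standard deviation 1 up to floating-point error.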
Further, step C comprises:
C1. Solve for the eigenvalues and eigenvectors of the keystroke data: compute the covariance matrix $C$ of the preprocessed data $X$, then solve for the eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_m$ and eigenvectors $e_1, e_2, \ldots, e_m$ of the covariance matrix.
C2. Form the transformation matrix: in order of eigenvalue from largest to smallest, arrange the eigenvectors from left to right to form the eigenvector matrix $P$.
C3. Reduce the dimensionality of the data $X$: determine the reduced dimension $k$, then transform the data. The eigenvector of the largest eigenvalue is the direction of greatest data variance and carries the most original information; the eigenvector of the smallest eigenvalue is the direction of least data variance and carries the least. Requiring the information utilization rate to reach 99% or more, $k$ is determined from $\frac{\sum_{i=1}^{k}\lambda_i}{\sum_{i=1}^{m}\lambda_i} \ge 0.99$. Take the eigenvectors $P_k$ corresponding to the first $k$ eigenvalues, then reduce the dimensionality of $X$; the transformed data is $Y = XP_k$.
C4. Compute the anomaly score of the data: for an eigenvector $e_j$, the deviation of the data sample $X_i$ along that direction is denoted $d_{ij}$ (its formula is missing from the extracted text). After the deviation of the data along every direction has been computed, the deviations are summed to obtain the composite anomaly score $\mathrm{Score}(X_i) = \sum_{j} d_{ij}$. The way the variance of the data changes along different directions reflects its intrinsic characteristics; if an individual sample is inconsistent with the characteristics shown by the data as a whole, so that its deviation is large, the sample is identified as an anomaly.
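Steps C1 to C3 can be sketched with NumPy as below. This is an illustration under stated assumptions rather than the patented code: the function name `fit_pca` is hypothetical, samples are assumed to be rows of an already centered matrix, and the covariance is normalized by n - 1 (the normalization used in the patent is not recoverable from the text). The demo data is synthetic and only centered, so that the directions have visibly different variances.

```python
import numpy as np


def fit_pca(X, info_ratio=0.99):
    """Return eigenvalues (descending), eigenvector matrix P, and dimension k."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    C = X.T @ X / (n - 1)                    # covariance of centered data (assumed form)
    eigvals, eigvecs = np.linalg.eigh(C)     # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]        # re-order from largest to smallest
    eigvals, P = eigvals[order], eigvecs[:, order]
    ratio = np.cumsum(eigvals) / eigvals.sum()
    # First index whose cumulative ratio reaches the target, converted to a count.
    k = min(int(np.searchsorted(ratio, info_ratio)) + 1, len(eigvals))
    return eigvals, P, k


# Synthetic demo data with very different spread per column.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 3)) * np.array([10.0, 3.0, 0.01])
X_demo -= X_demo.mean(axis=0)

eigvals, P, k = fit_pca(X_demo)
Y = X_demo @ P[:, :k]                        # reduced data, Y = X P_k
```

Here the third column contributes almost no variance, so two components are enough to pass the 99% ratio and `Y` has two columns.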
Further, in step D, after the user's anomaly scores have been obtained, the mean and standard deviation of all of the user's anomaly scores are computed, and the mean plus three times the standard deviation is used as the anomaly score threshold.
Compared with the prior art, the present invention has the following beneficial effects:
The PCA-based keystroke behavior anomaly detection method of the present invention builds the keystroke anomaly detection model with the PCA algorithm and detects abnormal behavior without a large amount of training data. It effectively addresses the heavy computation and large data requirements of existing keystroke anomaly recognition models, and has the advantages of low computational cost and a simple model.
Brief description of the drawings
Fig. 1 is a schematic diagram of keystroke data extraction according to the present invention.
Detailed description of the embodiments
The invention is further described below with reference to an embodiment.
Embodiment 1:
A PCA-based keystroke behavior anomaly detection method, comprising the following steps:
Step 1: Collect the user's keystroke data, including keystroke duration and keystroke interval time. The way the keystroke data is extracted is shown in Fig. 1.
The keystroke duration is $H_i = R_i - P_i$, where $P_i$ is the moment the i-th key is pressed and $R_i$ is the moment the i-th key is released.
The keystroke interval time is $I_i = P_{i+1} - P_i$, that is, the time between the press of the later key and the press of the earlier key.
Step 2: Preprocess the keystroke data.
This includes handling missing or malformed data and then centering and normalizing the data, which serves as the training data for the subsequent model. The calculation is $x' = \frac{x - \bar{x}}{\sigma}$, where $x'$ is the processed data, $x$ is the original data, $\bar{x}$ is the mean of the original data, and $\sigma$ is the standard deviation of the original data. Preprocessing the original keystroke data in this way yields data with mean 0 and standard deviation 1, which the method treats as following the standard normal distribution and uses for model training in the next step.
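The text does not say how missing or malformed data is handled. One simple possibility, shown here purely as an assumption, is to discard any record that has the wrong number of fields or contains a value that is missing, non-numeric, non-finite, or negative. The names `parse_record` and `clean_records` are hypothetical.

```python
import math
from typing import List, Optional, Sequence


def parse_record(raw: Sequence[object]) -> Optional[List[float]]:
    """Convert one raw record to floats, or return None if any field is bad."""
    values = []
    for item in raw:
        try:
            value = float(item)
        except (TypeError, ValueError):
            return None                      # missing or non-numeric field
        if not math.isfinite(value) or value < 0:
            return None                      # NaN, infinity, or negative time
        values.append(value)
    return values


def clean_records(records: Sequence[Sequence[object]], n_features: int) -> List[List[float]]:
    """Keep only records with exactly n_features valid timing values."""
    cleaned = []
    for raw in records:
        if len(raw) != n_features:
            continue                         # wrong number of fields
        parsed = parse_record(raw)
        if parsed is not None:
            cleaned.append(parsed)
    return cleaned


# Hypothetical raw records (times in milliseconds).
raw_records = [
    ["95", "140", "88"],      # valid
    ["102", "", "91"],        # missing value
    ["87", "n/a", "90"],      # format error
    ["99", "151"],            # too few fields
    [93, 138, 85],            # valid, already numeric
    ["90", "-5", "80"],       # negative time
]
cleaned = clean_records(raw_records, n_features=3)
print(cleaned)  # [[95.0, 140.0, 88.0], [93.0, 138.0, 85.0]]
```

Dropping bad records is the simplest policy; imputing missing values would be another option, but it changes the statistics the model is trained on.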
Step 3: Build the PCA anomaly detection model.
Step 3.1. Solve for the eigenvalues and eigenvectors of the data:
Compute the covariance matrix $C$ of the preprocessed data $X$, then solve for the eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_m$ and eigenvectors $e_1, e_2, \ldots, e_m$ of the covariance matrix.
Step 3.2. Form the transformation matrix: in order of eigenvalue from largest to smallest, arrange the eigenvectors from left to right to form the eigenvector matrix $P$.
Step 3.3. Reduce the dimensionality of the data $X$:
First determine the reduced dimension $k$. The eigenvector of the largest eigenvalue is the direction of greatest data variance and carries the most original information; the eigenvector of the smallest eigenvalue is the direction of least data variance and carries the least. Requiring the information utilization rate to reach 99% or more, that is, $\frac{\sum_{i=1}^{k}\lambda_i}{\sum_{i=1}^{m}\lambda_i} \ge 0.99$, gives the value of $k$. Take the eigenvectors $P_k$ corresponding to the first $k$ eigenvalues, then reduce the dimensionality of $X$; the transformed data is $Y = XP_k$.
Step 3.4. Compute the anomaly score of the data:
For an eigenvector $e_j$, the deviation of the data sample $X_i$ along that direction is denoted $d_{ij}$ (its formula is missing from the extracted text). After the deviation of the data along every direction has been computed, the deviations are summed to obtain the composite anomaly score $\mathrm{Score}(X_i) = \sum_{j} d_{ij}$.
The way the variance of the data changes along different directions reflects its intrinsic characteristics. If an individual sample is inconsistent with the characteristics shown by the data as a whole, so that its deviation is large, the sample is identified as an anomaly.
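Because the formula for $d_{ij}$ did not survive in this text, the sketch below substitutes a form that is common in PCA-based anomaly detection: the squared projection of the sample onto $e_j$, divided by the eigenvalue $\lambda_j$. This choice, the decision to sum over all directions rather than only the first $k$, and the name `anomaly_scores` are assumptions, not the confirmed method of the patent.

```python
import numpy as np


def anomaly_scores(X, eigvals, eigvecs, eps=1e-12):
    """Sum of variance-normalized squared projections over all directions.

    Assumed form: d_ij = (X_i . e_j)^2 / lambda_j, Score(X_i) = sum_j d_ij.
    """
    X = np.asarray(X, dtype=float)
    projections = X @ eigvecs                         # column j: projection on e_j
    d = projections ** 2 / np.maximum(eigvals, eps)   # eps guards tiny eigenvalues
    return d.sum(axis=1)


# Hand-made model: direction 0 has variance 4, direction 1 has variance 0.25.
eigvecs = np.eye(2)
eigvals = np.array([4.0, 0.25])

# Two samples of equal length pointing along different directions.
samples = np.array([[2.0, 0.0],
                    [0.0, 2.0]])
scores = anomaly_scores(samples, eigvals, eigvecs)
print(scores)  # [ 1. 16.]
```

Both samples are the same distance from the origin, but the one lying along the low-variance direction scores 16 instead of 1. Under this assumed form, a deviation along a direction where normal data barely varies counts far more than the same deviation along a high-variance direction.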
Step 4: Set the anomaly score threshold; a keystroke sample whose score exceeds the threshold is judged abnormal.
For example, compute the anomaly scores of 20 keystroke samples from one user:
$\mathrm{Score} = [\mathrm{score}_0, \mathrm{score}_1, \ldots, \mathrm{score}_{19}]$
Compute the mean $\bar{x}$ and standard deviation $\sigma$ of the anomaly scores and set the threshold to the mean plus three times the standard deviation: $\mathrm{threshold} = \bar{x} + 3\sigma$.
If the anomaly score of the sample under test is greater than the threshold, the keystroke behavior is judged abnormal.
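The threshold rule can be sketched as follows. The 20 scores are made-up values chosen so the arithmetic is easy to follow, the function names are hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which estimator is used.

```python
import numpy as np


def score_threshold(scores, n_sigma=3.0):
    """Threshold = mean + n_sigma * standard deviation of the training scores."""
    scores = np.asarray(scores, dtype=float)
    return float(scores.mean() + n_sigma * scores.std())


def is_abnormal(score, threshold):
    """A sample is abnormal only if its score is strictly above the threshold."""
    return bool(score > threshold)


# Illustrative scores for 20 typing samples of one user (not real data).
train_scores = [1.0] * 10 + [3.0] * 10     # mean 2.0, standard deviation 1.0
threshold = score_threshold(train_scores)
print(threshold)                            # 5.0
print(is_abnormal(5.5, threshold))          # True
print(is_abnormal(4.0, threshold))          # False
```

With only 20 scores the estimate of the standard deviation is noisy, so in practice the multiplier of 3 may need tuning against observed false alarms.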
It should be understood that the above embodiment is merely an exemplary implementation used to illustrate the principle of the present invention, and the present invention is not limited to it. Those skilled in the art can make various changes and modifications without departing from the spirit and essence of the present invention, and such changes and modifications are also regarded as falling within the protection scope of the present invention.
Claims (5)
1. A PCA-based keystroke behavior anomaly detection method, characterized by comprising the following steps:
A. collecting the user's keystroke data, including keystroke duration and keystroke interval time;
B. preprocessing the keystroke data: handling missing or malformed data, then centering and normalizing the keystroke data;
C. building a PCA anomaly detection model and obtaining a composite anomaly score;
D. setting an anomaly score threshold, and judging the keystroke behavior abnormal if the anomaly score of the sample under test exceeds the threshold.
2. The PCA-based keystroke behavior anomaly detection method according to claim 1, characterized in that, in step A, the keystroke duration is calculated as $H_i = R_i - P_i$ and the keystroke interval time is calculated as $I_i = P_{i+1} - P_i$, where $P_i$ is the moment the i-th key is pressed and $R_i$ is the moment the i-th key is released.
3. The PCA-based keystroke behavior anomaly detection method according to claim 2, characterized in that step B applies the following transformation to the keystroke data: $x' = \frac{x - \bar{x}}{\sigma}$, where $x'$ is the processed data, $x$ is the original data, $\bar{x}$ is the mean of the original data, and $\sigma$ is the standard deviation of the original data, so that preprocessing the original keystroke data yields data with mean 0 and standard deviation 1, treated as following the standard normal distribution and used for model training in the next step.
4. The PCA-based keystroke behavior anomaly detection method according to claim 3, characterized in that step C comprises:
C1. solving for the eigenvalues and eigenvectors of the keystroke data, including computing the covariance matrix $C$ of the preprocessed data $X$ and solving for the eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_m$ and eigenvectors $e_1, e_2, \ldots, e_m$ of the covariance matrix;
C2. forming the transformation matrix: in order of eigenvalue from largest to smallest, arranging the eigenvectors from left to right to form the eigenvector matrix $P$;
C3. reducing the dimensionality of the data $X$, including determining the reduced dimension $k$ and transforming the data, wherein the eigenvector of the largest eigenvalue is the direction of greatest data variance, the eigenvector of the smallest eigenvalue is the direction of least data variance, and the reduced dimension $k$ is determined from $\frac{\sum_{i=1}^{k}\lambda_i}{\sum_{i=1}^{m}\lambda_i} \ge 0.99$; taking the eigenvectors $P_k$ corresponding to the first $k$ eigenvalues; then reducing the dimensionality of $X$, the transformed data being $Y = XP_k$;
C4. computing the anomaly score of the data: for an eigenvector $e_j$, the deviation of the data sample $X_i$ along that direction is denoted $d_{ij}$ (its formula is missing from the extracted text); after the deviation of the data along every direction has been computed, the deviations are summed to obtain the composite anomaly score $\mathrm{Score}(X_i) = \sum_{j} d_{ij}$.
5. The PCA-based keystroke behavior anomaly detection method according to claim 4, characterized in that, in step D, after the user's anomaly scores have been obtained, the mean and standard deviation of all of the user's anomaly scores are computed, and the mean plus three times the standard deviation is used as the anomaly score threshold.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910785323.9A CN110502883B (en) | 2019-08-23 | 2019-08-23 | PCA-based keystroke behavior anomaly detection method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910785323.9A CN110502883B (en) | 2019-08-23 | 2019-08-23 | PCA-based keystroke behavior anomaly detection method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110502883A (en) | 2019-11-26 |
CN110502883B (en) | 2022-08-19 |
Family
ID=68589339
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910785323.9A Active CN110502883B (en) | 2019-08-23 | 2019-08-23 | PCA-based keystroke behavior anomaly detection method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110502883B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111984952A (en) * | 2020-09-03 | 2020-11-24 | 四川长虹电器股份有限公司 | HMM-based user input behavior abnormity identification method |
CN114509690A (en) * | 2022-04-19 | 2022-05-17 | 杭州宇谷科技有限公司 | PCA (principal component analysis) decomposition-based lithium battery cell charging and discharging abnormity detection method and system |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101833619A (en) * | 2010-04-29 | 2010-09-15 | 西安交通大学 | Method for judging identity based on keyboard-mouse crossed certification |
US20110320816A1 (en) * | 2009-03-13 | 2011-12-29 | Rutgers, The State University Of New Jersey | Systems and method for malware detection |
CN105389486A (en) * | 2015-11-05 | 2016-03-09 | 同济大学 | Authentication method based on mouse behavior |
CN105933267A (en) * | 2015-08-21 | 2016-09-07 | ***股份有限公司 | Identity authentication method and device |
CN106101116A (en) * | 2016-06-29 | 2016-11-09 | 东北大学 | A kind of user behavior abnormality detection system based on principal component analysis and method |
CN109145554A (en) * | 2018-07-12 | 2019-01-04 | 温州大学苍南研究院 | A kind of recognition methods of keystroke characteristic abnormal user and system based on support vector machines |
CN109308306A (en) * | 2018-09-29 | 2019-02-05 | 重庆大学 | A kind of user power utilization anomaly detection method based on isolated forest |
CN109377409A (en) * | 2018-09-29 | 2019-02-22 | 重庆大学 | A kind of user power utilization anomaly detection method based on BP neural network |
CN109447099A (en) * | 2018-08-28 | 2019-03-08 | 西安理工大学 | A kind of Combining Multiple Classifiers based on PCA dimensionality reduction |
CN109815655A (en) * | 2017-11-22 | 2019-05-28 | 北京纳米能源与***研究所 | Identification and verifying system, method, apparatus and computer readable storage medium |
- 2019-08-23: CN application CN201910785323.9A (patent CN110502883B), status: Active
Patent Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110320816A1 (en) * | 2009-03-13 | 2011-12-29 | Rutgers, The State University Of New Jersey | Systems and method for malware detection |
CN101833619A (en) * | 2010-04-29 | 2010-09-15 | 西安交通大学 | Method for judging identity based on keyboard-mouse crossed certification |
CN105933267A (en) * | 2015-08-21 | 2016-09-07 | ***股份有限公司 | Identity authentication method and device |
CN105389486A (en) * | 2015-11-05 | 2016-03-09 | 同济大学 | Authentication method based on mouse behavior |
CN106101116A (en) * | 2016-06-29 | 2016-11-09 | 东北大学 | A kind of user behavior abnormality detection system based on principal component analysis and method |
CN109815655A (en) * | 2017-11-22 | 2019-05-28 | 北京纳米能源与***研究所 | Identification and verifying system, method, apparatus and computer readable storage medium |
CN109145554A (en) * | 2018-07-12 | 2019-01-04 | 温州大学苍南研究院 | A kind of recognition methods of keystroke characteristic abnormal user and system based on support vector machines |
CN109447099A (en) * | 2018-08-28 | 2019-03-08 | 西安理工大学 | A kind of Combining Multiple Classifiers based on PCA dimensionality reduction |
CN109308306A (en) * | 2018-09-29 | 2019-02-05 | 重庆大学 | A kind of user power utilization anomaly detection method based on isolated forest |
CN109377409A (en) * | 2018-09-29 | 2019-02-22 | 重庆大学 | A kind of user power utilization anomaly detection method based on BP neural network |
Non-Patent Citations (4)
Title |
---|
IGNACIO DE MENDIZABAL-VÁZQUEZ et al.: "Supervised classification methods applied to Keystroke Dynamics through Mobile Devices", 《2014 INTERNATIONAL CARNAHAN CONFERENCE ON SECURITY TECHNOLOGY (ICCST)》 * |
吴梦溪: "基于输入特征的用户身份认证的研究", 《中国优秀博硕士学位论文全文数据库(硕士) 信息科技辑》 * |
王焘 等: "一种基于自适应监测的云计算***故障检测方法", 《计算机学报》 * |
郭志民 等: "基于用户与网络行为分析的主机异常检测方法", 《北京交通大学学报》 * |
Also Published As
Publication number | Publication date |
---|---|
CN110502883B (en) | 2022-08-19 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106326886B (en) | Finger vein image quality appraisal procedure based on convolutional neural networks | |
Sanchez-Reillo et al. | Biometric identification through hand geometry measurements | |
Raghavendra et al. | Designing efficient fusion schemes for multimodal biometric systems using face and palmprint | |
CN101226590B (en) | Method for recognizing human face | |
CN111144522B (en) | Power grid NFC equipment fingerprint authentication method based on hardware intrinsic difference | |
CN113489685B (en) | Secondary feature extraction and malicious attack identification method based on kernel principal component analysis | |
CN111476222B (en) | Image processing method, image processing device, computer equipment and computer readable storage medium | |
CN111625792B (en) | Identity recognition method based on abnormal behavior detection | |
Karnan et al. | Bio password—keystroke dynamic approach to secure mobile devices | |
CN106991312B (en) | Internet anti-fraud authentication method based on voiceprint recognition | |
CN111625789B (en) | User identification method based on multi-core learning fusion of mouse and keyboard behavior characteristics | |
CN109190698B (en) | Classification and identification system and method for network digital virtual assets | |
CN110008674A (en) | A kind of electrocardiosignal identity identifying method of high generalization | |
CN110276189B (en) | User identity authentication method based on gait information | |
CN107944356A (en) | The identity identifying method of the hierarchical subject model palmprint image identification of comprehensive polymorphic type feature | |
CN110502883A (en) | A kind of keystroke abnormal behavior detection method based on PCA | |
CN111160424A (en) | NFC equipment fingerprint authentication method and system based on CNN image identification | |
CN108256449A (en) | A kind of Human bodys' response method based on subspace grader | |
CN107026928A (en) | A kind of behavioural characteristic identification authentication method and device based on mobile phone sensor | |
TWI325568B (en) | A method for face varification | |
CN110378414B (en) | Multi-mode biological characteristic fusion identity recognition method based on evolution strategy | |
CN103207993A (en) | Face recognition method based on nuclear distinguishing random neighbor embedding analysis | |
CN113673343B (en) | Open set palmprint recognition system and method based on weighting element measurement learning | |
CN114840834A (en) | Implicit identity authentication method based on gait characteristics | |
CN114021181A (en) | Mobile intelligent terminal privacy continuous protection system and method based on use habits |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||