CN110502883A - PCA-based keystroke behavior anomaly detection method - Google Patents


Publication number
CN110502883A
Authority
CN
China
Prior art keywords
data
keystroke
abnormal
pca
score
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910785323.9A
Other languages
Chinese (zh)
Other versions
CN110502883B (en)
Inventor
刘录
常清雪
文有庆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan Changhong Electric Co Ltd
Original Assignee
Sichuan Changhong Electric Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan Changhong Electric Co Ltd
Priority to CN201910785323.9A
Publication of CN110502883A
Application granted
Publication of CN110502883B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3438 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment monitoring of user actions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3466 Performance evaluation by tracing or monitoring
    • G06F11/3476 Data logging
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/30 Authentication, i.e. establishing the identity or authorisation of security principals
    • G06F21/31 User authentication
    • G06F21/316 User authentication by observing the pattern of computer usage, e.g. typical user behaviour

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Computer Security & Cryptography (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Social Psychology (AREA)
  • Software Systems (AREA)
  • Input From Keyboards Or The Like (AREA)
  • Collating Specific Patterns (AREA)

Abstract

The invention discloses a PCA-based method for detecting abnormal keystroke behavior, comprising the following steps: A. collecting the user's keystroke data, including keystroke duration and keystroke interval time; B. preprocessing the keystroke data by handling missing or incorrectly formatted data and then centering and normalizing the keystroke data; C. establishing a PCA anomaly detection model and obtaining a comprehensive anomaly score; D. setting an anomaly score threshold and determining that the keystroke behavior is abnormal if the anomaly score of the detected sample is greater than the threshold. The method builds a model from a normal user's keystroke data and detects abnormal behavior without requiring a large amount of training data, and the PCA algorithm has the advantages of low computational cost and a simple model.

Description

A PCA-based keystroke behavior anomaly detection method
Technical field
The present invention relates to the technical fields of machine learning and network security, and in particular to a PCA-based keystroke behavior anomaly detection method.
Background art
Biometric identification authenticates a person through physiological characteristics unique to each individual, such as fingerprints, palm prints, faces and irises, or through behavioral characteristics such as handwriting and voice. Because these characteristics are largely unique and difficult to imitate, they greatly reduce the risk of a user being impersonated. As biometric identification has matured, it has been applied successfully in many fields, but its adoption is limited by the need for additional, relatively expensive feature extraction equipment. Keystroke features have clear advantages over other biometric methods. First, the keyboard itself serves as the feature extraction device, so only identification software needs to be embedded in the computer system, which keeps the cost low. Second, the user login process and the verification process are combined, so the user's normal use is not affected.
In previous research, keystroke pattern recognition has mainly used methods such as support vector machines and BP neural networks. A support vector machine builds a binary classification model from the normal user's keystroke behavior data and abnormal data. Because all keystroke data other than the user's own count as abnormal data, the abnormal behavior data are of many kinds and cannot be collected comprehensively, which degrades the model; the support vector machine model also has relatively high computational complexity. In addition, keystroke anomaly detection builds one model per user, so with a large number of users the support vector machine models occupy excessive storage space. The BP neural network model likewise has a complex structure, and its training requires a large amount of data, so the accuracy of the model suffers when a user's data volume is insufficient.
Summary of the invention
The purpose of the present invention is to overcome the above deficiencies in the background art by providing a PCA-based keystroke behavior anomaly detection method that builds a model from a normal user's keystroke data and detects abnormal behavior without requiring a large amount of training data, the PCA algorithm having low computational cost and a simple model.
To achieve the above technical effect, the present invention adopts the following technical solution:
A PCA-based keystroke behavior anomaly detection method, comprising the following steps:
A. Collect the user's keystroke data, including keystroke duration and keystroke interval time;
B. Preprocess the keystroke data: handle missing or incorrectly formatted data, then center and normalize the keystroke data. Centering makes the subsequent formulas more concise and does not affect the eigenvalue decomposition; normalization brings the variance scales of the different variables into the same range and eliminates the influence of different units, making the variables more comparable;
C. Establish a PCA anomaly detection model and obtain a comprehensive anomaly score. The keystroke behavior anomaly detection model is built with PCA. The eigenvectors obtained from the eigenvalue decomposition in PCA reflect the different directions along which the variance of the original data varies, and each eigenvalue is the variance of the data along the corresponding direction. The eigenvector of the largest eigenvalue is therefore the direction of greatest data variance, and the eigenvector of the smallest eigenvalue is the direction of least data variance. If an individual data sample is not consistent with the characteristics shown by the data as a whole, for example if it deviates strongly from the other samples along some direction, the sample may be an anomaly;
D. Set an anomaly score threshold, and determine that the keystroke behavior is abnormal if the anomaly score of the detected sample is greater than the threshold.
Further, in step A, the keystroke duration is the difference between the moment a key is released and the moment it is pressed, calculated as $H_i = t_i^{\text{up}} - t_i^{\text{down}}$, and the keystroke interval time is calculated as $I_i = t_{i+1}^{\text{down}} - t_i^{\text{down}}$, where $t_i^{\text{down}}$ denotes the moment the i-th key is pressed and $t_i^{\text{up}}$ denotes the moment the i-th key is released. In this method the keystroke interval time is taken as the time difference between the press of the following key and the press of the preceding key, which avoids problems when the second key is pressed before the preceding key has been released.
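The two timing features of step A can be sketched as follows. This is only a minimal illustration, assuming that press and release timestamps are available as two parallel lists in milliseconds; the function name `keystroke_features` and the input layout are hypothetical and are not taken from the patent.

```python
from typing import List, Tuple


def keystroke_features(press: List[float], release: List[float]) -> Tuple[List[float], List[float]]:
    """Return (keystroke durations, press-to-press intervals) for one typing sample."""
    if len(press) != len(release):
        raise ValueError("press and release must have the same length")
    # Duration of key i: release moment minus press moment of the same key.
    durations = [r - p for p, r in zip(press, release)]
    # Interval: press moment of key i+1 minus press moment of key i.
    intervals = [press[i + 1] - press[i] for i in range(len(press) - 1)]
    return durations, intervals


# The second key is pressed at 80 ms, before the first key is released at 90 ms.
durations, intervals = keystroke_features([0.0, 80.0, 320.0], [90.0, 200.0, 400.0])
```

In this example the first two keys overlap, yet both intervals remain non-negative because they are measured from press to press, which is the situation the text describes.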
Further, step B specifically applies the following processing to the keystroke data: $x' = \frac{x - \bar{x}}{\sigma}$, where $x'$ is the processed data, $x$ is the original data, $\bar{x}$ is the mean of the original data and $\sigma$ is the standard deviation of the original data. Preprocessing the original keystroke data in this way yields data with mean 0 and standard deviation 1, on the scale of a standard normal distribution, which serve as the data for the next step of model training.
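The centering and normalization of step B can be sketched with NumPy as below. The sketch assumes that each row of the array is one typing sample and each column one timing feature; the function names and the guard for constant features are illustrative additions, not part of the patent.

```python
import numpy as np


def fit_standardizer(X):
    """Return the per-feature mean and standard deviation of the training data."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    # Guard against constant features, which would otherwise cause division by zero.
    std = np.where(std == 0, 1.0, std)
    return mean, std


def standardize(X, mean, std):
    """Apply x' = (x - mean) / std column by column."""
    return (X - mean) / std


X_train = np.array([[90.0, 150.0], [110.0, 170.0], [100.0, 160.0], [120.0, 200.0]])
mean, std = fit_standardizer(X_train)
Z = standardize(X_train, mean, std)
```

Keeping `mean` and `std` from the training data makes it possible to apply the same transformation to a new sample before it is scored.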
Further, step C comprises:
C1. Solve for the eigenvalues and eigenvectors of the keystroke data: compute the covariance matrix $C$ of the preprocessed data, then solve for the eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_m$ and eigenvectors $e_1, e_2, \ldots, e_m$ of the covariance matrix;
C2. Form the transformation matrix: arrange the eigenvectors from left to right in order of decreasing eigenvalue to form the eigenvector matrix $P$;
C3. Reduce the dimension of the data $X$: this comprises determining the reduced dimension $k$ and transforming the data. The eigenvector of the largest eigenvalue is the direction of greatest data variance and carries the most information about the original data; the eigenvector of the smallest eigenvalue is the direction of least data variance and carries the least. Requiring the information utilization to reach 99% or more, the reduced dimension $k$ is found from $\frac{\sum_{j=1}^{k}\lambda_j}{\sum_{j=1}^{m}\lambda_j} \ge 0.99$. The eigenvectors corresponding to the first $k$ eigenvalues are taken to form $P_k$; the dimension of $X$ is then reduced, and the transformed data are $Y = XP_k$;
C4. Calculate the anomaly score of the data: for a given eigenvector $e_j$, the degree of deviation $d_{ij}$ of data sample $X_i$ along that direction is calculated [formula not reproduced in the source text]. After the degree of deviation of the data has been calculated along every direction, the deviations along all directions are summed to give the comprehensive anomaly score $\mathrm{Score}(X_i) = \sum_{j} d_{ij}$. The variance of the data along the different directions reflects its intrinsic characteristics; if an individual data sample is inconsistent with the characteristics shown by the data as a whole and its deviation is large, the sample is identified as an anomaly.
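Steps C1, C2 and C4 can be sketched as follows. Because the formula for the degree of deviation is not reproduced in this text, the sketch assumes a commonly used form, namely the squared projection of a sample onto $e_j$ divided by $\lambda_j$; this choice, the use of `np.cov` for the covariance matrix, and all function names are assumptions rather than the patent's own definitions.

```python
import numpy as np


def fit_pca(X):
    """Eigendecompose the covariance matrix of standardized data X (rows = samples)."""
    cov = np.cov(X, rowvar=False)            # m x m covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh handles symmetric matrices, ascending order
    order = np.argsort(eigvals)[::-1]        # reorder from largest to smallest eigenvalue
    return eigvals[order], eigvecs[:, order]


def anomaly_scores(X, eigvals, eigvecs, eps=1e-12):
    """Sum over directions of (projection onto e_j)^2 / lambda_j (assumed form of d_ij)."""
    keep = eigvals > eps                     # skip directions with essentially zero variance
    proj = X @ eigvecs[:, keep]              # projection of every sample onto every kept e_j
    return np.sum(proj ** 2 / eigvals[keep], axis=1)


rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))                # synthetic stand-in for a user's keystroke features
X = (X - X.mean(axis=0)) / X.std(axis=0)
eigvals, eigvecs = fit_pca(X)
scores = anomaly_scores(X, eigvals, eigvecs)
outlier_score = anomaly_scores(np.array([[8.0, -8.0, 8.0, -8.0]]), eigvals, eigvecs)[0]
```

Dividing by $\lambda_j$ puts every direction on the same scale, so a moderate deviation along a low-variance direction counts as much as a large deviation along a high-variance one. With this assumed form and all directions kept, the score equals the squared Mahalanobis distance of the sample from the origin of the standardized data.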
Further, in step D, after the user's anomaly scores have been obtained, the mean and standard deviation of all of the user's anomaly scores are calculated, and the mean plus 3 times the standard deviation is used as the anomaly score threshold.
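Written out, the threshold rule of step D takes the following form. The symbols $s_i$, $\bar{s}$, $\sigma_s$ and $T$ are introduced here only for illustration, and the population form of the standard deviation (division by $n$) is an assumption, since the text does not state which form is used.

```latex
\bar{s} = \frac{1}{n}\sum_{i=1}^{n} s_i, \qquad
\sigma_s = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(s_i - \bar{s}\right)^2}, \qquad
T = \bar{s} + 3\sigma_s
```

Here $s_1, \ldots, s_n$ are the anomaly scores of the user's collected samples, and a new sample with score $s$ is judged abnormal when $s > T$.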
Compared with the prior art, the present invention has the following beneficial effects:
The PCA-based keystroke behavior anomaly detection method of the present invention builds a keystroke behavior anomaly detection model with the PCA algorithm in order to detect abnormal behavior. It does not require a large amount of training data, effectively solves the problems of heavy computation and large data requirements in keystroke anomaly identification models, and has the advantages of low computational cost and a simple model.
Description of the drawings
Fig. 1 is a schematic diagram of keystroke data extraction according to the present invention.
Detailed description of the embodiments
The invention is further described below with reference to embodiments of the invention.
Embodiment:
Embodiment one:
A PCA-based keystroke behavior anomaly detection method, comprising the following steps:
Step 1: Collect the user's keystroke data, including keystroke duration and keystroke interval time; the keystroke data are extracted as shown in Fig. 1.
The keystroke duration is $H_i = t_i^{\text{up}} - t_i^{\text{down}}$, where $t_i^{\text{down}}$ denotes the moment the i-th key is pressed and $t_i^{\text{up}}$ denotes the moment the i-th key is released;
The keystroke interval time is $I_i = t_{i+1}^{\text{down}} - t_i^{\text{down}}$; that is, the keystroke interval time is the time between the press of the following key and the press of the preceding key.
Step 2: Preprocess the keystroke data.
This includes handling missing or incorrectly formatted data and then centering and normalizing the data to serve as training data for the subsequent model. The calculation is $x' = \frac{x - \bar{x}}{\sigma}$, where $x'$ is the processed data, $x$ is the original data, $\bar{x}$ is the mean of the original data and $\sigma$ is the standard deviation of the original data. Preprocessing the original keystroke data in this way yields data with mean 0 and standard deviation 1, on the scale of a standard normal distribution, which serve as the data for the next step of model training.
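The handling of missing or incorrectly formatted data is not specified further in the text. One simple possibility, shown here purely as an assumption, is to discard any sample whose feature vector has the wrong length or contains missing, non-numeric, non-finite or negative timings; the function name `clean_samples` is hypothetical.

```python
import math
from typing import List, Optional, Sequence


def clean_samples(samples: Sequence[Sequence[Optional[float]]], n_features: int) -> List[List[float]]:
    """Keep only rows with the expected length and finite, non-negative timings."""
    cleaned = []
    for row in samples:
        if len(row) != n_features:
            continue  # format error: unexpected number of features
        if any(v is None or not isinstance(v, (int, float)) for v in row):
            continue  # missing or non-numeric value
        if any(math.isnan(v) or math.isinf(v) or v < 0 for v in row):
            continue  # timings must be finite and non-negative
        cleaned.append([float(v) for v in row])
    return cleaned


raw = [
    [90.0, 120.0, 80.0],
    [95.0, None, 82.0],          # missing value
    [88.0, 115.0],               # wrong length
    [92.0, float("nan"), 79.0],  # not a number
    [91.0, 118.0, -5.0],         # negative timing
    [93.0, 121.0, 81.0],
]
clean = clean_samples(raw, n_features=3)
```

Dropping samples keeps the sketch short; filling gaps with a per-feature mean would be another option, at the cost of introducing values the user never produced.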
Step 3: Establish the PCA anomaly detection model.
Step 3.1. Solve for the eigenvalues and eigenvectors of the data:
Compute the covariance matrix $C$ of the preprocessed data, then solve for the eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_m$ and eigenvectors $e_1, e_2, \ldots, e_m$ of the covariance matrix.
Step 3.2. Form the transformation matrix: arrange the eigenvectors from left to right in order of decreasing eigenvalue to form the eigenvector matrix $P$.
Step 3.3. Reduce the dimension of the data $X$:
First determine the reduced dimension $k$. The eigenvector of the largest eigenvalue is the direction of greatest data variance and carries the most information about the original data; the eigenvector of the smallest eigenvalue is the direction of least data variance and carries the least. Requiring the information utilization to reach 99% or more, that is, $\frac{\sum_{j=1}^{k}\lambda_j}{\sum_{j=1}^{m}\lambda_j} \ge 0.99$, gives the value of $k$; the eigenvectors corresponding to the first $k$ eigenvalues are taken to form $P_k$. The dimension of $X$ is then reduced, and the transformed data are $Y = XP_k$.
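The choice of $k$ in step 3.3 can be sketched as below, assuming the eigenvalues are already sorted from largest to smallest. The eigenvalues, the identity matrix standing in for $P$, and the function name `choose_k` are illustrative values only.

```python
import numpy as np


def choose_k(eigvals, ratio=0.99):
    """Smallest k whose leading eigenvalues reach the given share of the total variance."""
    cumulative = np.cumsum(eigvals) / np.sum(eigvals)
    k = int(np.searchsorted(cumulative, ratio)) + 1
    return min(k, len(eigvals))


eigvals = np.array([6.0, 3.0, 0.95, 0.05])  # cumulative shares: 0.6, 0.9, 0.995, 1.0
k = choose_k(eigvals)
P = np.eye(4)                                # placeholder eigenvector matrix, columns = e_j
X = np.arange(8.0).reshape(2, 4)             # two samples with four features
Y = X @ P[:, :k]                             # reduced data Y = X P_k
```

Here the first two eigenvalues cover only 90% of the variance, so a third direction is needed to reach the 99% requirement.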
Step 3.4. Calculate the anomaly score of the data:
For a given eigenvector $e_j$, the degree of deviation $d_{ij}$ of data sample $X_i$ along that direction is calculated [formula not reproduced in the source text]. After the degree of deviation of the data has been calculated along every direction, the deviations along all directions are summed to give the comprehensive anomaly score $\mathrm{Score}(X_i) = \sum_{j} d_{ij}$.
The variance of the data along the different directions reflects its intrinsic characteristics; if an individual data sample is inconsistent with the characteristics shown by the data as a whole and its deviation is large, the sample is identified as an anomaly.
Step 4: Set the anomaly score threshold; a keystroke behavior whose score is greater than the threshold is judged to be abnormal.
For example, calculate the anomaly scores of 20 keystroke behavior samples of a user:
$\mathrm{Score} = [\mathrm{score}_0, \mathrm{score}_1, \ldots, \mathrm{score}_{19}]$
Calculate the mean $\bar{x}$ and standard deviation $\sigma$ of the anomaly scores, and set the threshold to the mean plus 3 times the standard deviation: $\text{threshold} = \bar{x} + 3\sigma$.
If the anomaly score of the detected sample is greater than the threshold, the keystroke behavior is judged to be abnormal.
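The 20-sample example of step 4 can be sketched as follows. The score values are invented for illustration, `np.std` is used in its default population form, and the function names are hypothetical.

```python
import numpy as np


def fit_threshold(train_scores, n_sigma=3.0):
    """Threshold = mean + n_sigma * standard deviation of the user's anomaly scores."""
    train_scores = np.asarray(train_scores, dtype=float)
    return float(train_scores.mean() + n_sigma * train_scores.std())


def is_abnormal(score, threshold):
    """A sample is flagged only when its score is strictly greater than the threshold."""
    return bool(score > threshold)


# Hypothetical anomaly scores of 20 keystroke samples from one user.
scores = [2.0, 4.0] * 10
threshold = fit_threshold(scores)   # mean 3.0, standard deviation 1.0
flag_high = is_abnormal(7.2, threshold)
flag_low = is_abnormal(5.5, threshold)
```

With these values the threshold is 3.0 + 3 × 1.0 = 6.0, so a new sample scoring 7.2 is judged abnormal while one scoring 5.5 is not.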
It should be understood that the above embodiments are merely exemplary embodiments used to illustrate the principle of the present invention, and the present invention is not limited to them. Those of ordinary skill in the art may make various changes and modifications without departing from the spirit and essence of the present invention, and such changes and modifications are also regarded as falling within the scope of protection of the present invention.

Claims (5)

1. A PCA-based keystroke behavior anomaly detection method, characterized by comprising the following steps:
A. collecting the user's keystroke data, including keystroke duration and keystroke interval time;
B. preprocessing the keystroke data: handling missing or incorrectly formatted data, then centering and normalizing the keystroke data;
C. establishing a PCA anomaly detection model and obtaining a comprehensive anomaly score;
D. setting an anomaly score threshold, and determining that the keystroke behavior is abnormal if the anomaly score of the detected sample is greater than the threshold.
2. The PCA-based keystroke behavior anomaly detection method according to claim 1, characterized in that, in step A, the keystroke duration is calculated as $H_i = t_i^{\text{up}} - t_i^{\text{down}}$ and the keystroke interval time is calculated as $I_i = t_{i+1}^{\text{down}} - t_i^{\text{down}}$, where $t_i^{\text{down}}$ denotes the moment the i-th key is pressed and $t_i^{\text{up}}$ denotes the moment the i-th key is released.
3. The PCA-based keystroke behavior anomaly detection method according to claim 2, characterized in that step B specifically applies the following processing to the keystroke data: $x' = \frac{x - \bar{x}}{\sigma}$, where $x'$ is the processed data, $x$ is the original data, $\bar{x}$ is the mean of the original data and $\sigma$ is the standard deviation of the original data, so that preprocessing the original keystroke data yields data with mean 0 and standard deviation 1, on the scale of a standard normal distribution, which serve as the data for the next step of model training.
4. The PCA-based keystroke behavior anomaly detection method according to claim 3, characterized in that step C comprises:
C1. solving for the eigenvalues and eigenvectors of the keystroke data, including computing the covariance matrix $C$ of the preprocessed data and solving for the eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_m$ and eigenvectors $e_1, e_2, \ldots, e_m$ of the covariance matrix;
C2. forming the transformation matrix: arranging the eigenvectors from left to right in order of decreasing eigenvalue to form the eigenvector matrix $P$;
C3. reducing the dimension of the data $X$, including determining the reduced dimension $k$ and transforming the data, wherein the eigenvector of the largest eigenvalue is the direction of greatest data variance, the eigenvector of the smallest eigenvalue is the direction of least data variance, and the reduced dimension $k$ is found from $\frac{\sum_{j=1}^{k}\lambda_j}{\sum_{j=1}^{m}\lambda_j} \ge 0.99$; the eigenvectors corresponding to the first $k$ eigenvalues are taken to form $P_k$; the dimension of $X$ is then reduced, and the transformed data are $Y = XP_k$;
C4. calculating the anomaly score of the data: for a given eigenvector $e_j$, the degree of deviation $d_{ij}$ of data sample $X_i$ along that direction is calculated [formula not reproduced in the source text]; after the degree of deviation of the data has been calculated along every direction, the deviations along all directions are summed to give the comprehensive anomaly score $\mathrm{Score}(X_i) = \sum_{j} d_{ij}$.
5. The PCA-based keystroke behavior anomaly detection method according to claim 4, characterized in that, in step D, after the user's anomaly scores have been obtained, the mean and standard deviation of all of the user's anomaly scores are calculated, and the mean plus 3 times the standard deviation is used as the anomaly score threshold.
CN201910785323.9A 2019-08-23 2019-08-23 PCA-based keystroke behavior anomaly detection method Active CN110502883B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910785323.9A CN110502883B (en) 2019-08-23 2019-08-23 PCA-based keystroke behavior anomaly detection method

Publications (2)

Publication Number Publication Date
CN110502883A true CN110502883A (en) 2019-11-26
CN110502883B CN110502883B (en) 2022-08-19

Family

ID=68589339

Country Status (1)

Country Link
CN (1) CN110502883B (en)

Also Published As

Publication number Publication date
CN110502883B (en) 2022-08-19

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant