CN110177112B - Network intrusion detection method based on double subspace sampling and confidence offset - Google Patents
- Publication number
- CN110177112B
- Authority
- CN
- China
- Prior art keywords
- layer
- sample
- confidence
- model
- feature
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/14—Network analysis or design
- H04L41/145—Network analysis or design involving simulating, designing, planning or modelling of a network
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1408—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
- H04L63/1416—Event detection, e.g. attack signature detection
Landscapes
- Engineering & Computer Science (AREA)
- Computer Security & Cryptography (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Computer Hardware Design (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Complex Calculations (AREA)
Abstract
The invention provides a network intrusion detection method based on double subspace sampling and confidence offset. First, the base classifiers of each layer are trained on data preprocessed by down-sampling at both the sample level and the feature level. Second, the confidences output by each layer are concatenated with the original features by interpolation and fed to the next layer of the model as new features. The interpolated confidences are then perturbed layer by layer through the cascade model. The confidence perturbation is not applied in the test step. Compared with traditional ensemble methods for imbalanced classification, the method extends the deep forest to the imbalanced problem and further addresses the threshold problem in imbalanced classification through the cascade structure. The system trains models with a selectable perturbation amplitude, which can effectively improve detection performance on imbalanced network intrusion data, and the layer-by-layer stacked ensemble can achieve better generalization performance during detection.
Description
Technical Field
The invention relates to an imbalanced network intrusion detection and identification method, and belongs to the field of network information security.
Background
With the rapid development of network technology and the steady growth of the Internet, network security has come into public view, and network intrusion identification has become a popular research field in recent years. The basic network attack types include denial of service (DoS), unauthorized access from a remote host (Remote-to-Local, R2L), unauthorized superuser access (User-to-Root, U2R), and probing (Probing); many sub-attack methods can be derived from these. It is therefore imperative to construct targeted detection schemes for these network attacks.
Existing network attack detection methods commonly fall into three groups: 1) rule-based detection methods, which depend heavily on an existing rule database, cannot be updated in time for new attack techniques, and can therefore lead to large losses; 2) detection methods that rely on the distribution of network traffic characteristics, which depend excessively on randomness and can be evaded by some attack techniques; 3) intrusion detection methods based on machine learning, for example support vector machines, random forests, and neural networks. Machine-learning-based methods can respond to unknown network attacks effectively and in time. However, owing to differing physical conditions and environmental constraints, network intrusion data tend to be imbalanced across categories, and traditional machine learning methods have difficulty handling this imbalance.
Imbalanced network intrusion detection can be addressed effectively by combining ensemble learning with data sampling. According to the ensemble strategy, these sampling-based ensemble methods can be divided into bagging, boosting, and hybrid ensembles, and each family already has a number of representative algorithms. In addition, random feature subspace algorithms have been proposed to avoid underestimating implicitly important features and to filter out potentially noisy features. Combining such algorithms with a bagging strategy and a base classifier such as an SVM has produced representative algorithms such as ABRS-SVM.
In 2017, Professor Zhou Zhihua proposed the deep forest ensemble algorithm, which can compete with deep learning in classification performance while using fewer hyper-parameters and a lighter model, and which can also achieve good classification results on small-scale data sets. The cascade forest follows a model-stacking design, which can effectively improve the generalization performance of the algorithm.
However, the above ensemble learning methods do not handle the threshold problem in imbalanced classification well, so their classification performance falls short on highly imbalanced data sets such as network intrusion data. The cascade forest, as a new ensemble model, has no optimization strategy aimed at the imbalance problem. A new cascaded ensemble model is therefore needed to extend the cascade forest to the imbalanced problem and to address the threshold problem in existing imbalanced network intrusion detection.
Disclosure of Invention
Existing ensemble algorithms cannot effectively solve imbalanced network intrusion detection, cannot determine the ensemble size well, and can only be modeled by experience. To address these problems, the invention extends the cascade forest and provides a network intrusion detection method based on double subspace sampling and confidence offset. The ensemble model uses the model-stacking structure of the cascade forest to adjust the classification threshold of the imbalanced problem layer by layer; double down-sampling is introduced as data preprocessing so that the model can handle imbalanced network intrusion detection; and a validation mechanism controls the size of the model.
The technical scheme adopted by the invention is as follows. First, the back end converts the collected samples into vectors that the system can process according to the specific problem description, and discrete features are one-hot encoded. Second, the ensemble model optimizes classification performance on the imbalanced problem through a double down-sampling strategy at the sample level and the feature level. Third, the confidences output by the base classifiers of the previous layer are perturbed and concatenated with the original features as the input of the next layer. Fourth, a validation mechanism is added to the cascade model so that the number of layers stops growing adaptively. During testing, data are passed through the previously trained cascade model without confidence offset, and the output of the last layer is taken as the final result.
The technical scheme can be further refined. In the first stage of the training step, the base classifiers trained in each layer are the classical random forest and naive Bayes algorithms. Other base classifiers could be used; only these two were selected in the experiments, in view of interpretability and implementation difficulty. In the testing and validation processes, the average of the accuracies on the majority and minority classes is used as the evaluation index so that the performance of the algorithm is measured objectively.
The invention has the following beneficial effects: a cascaded ensemble model is designed that extends the cascade forest to the imbalanced setting, and the perturbation amplitude is controlled by adjusting the hyper-parameter η, so that the model can effectively solve imbalanced classification problems.
Drawings
FIG. 1 is an overall flow chart of the present invention.
Detailed Description
The invention will be further described with reference to the following figures and examples: the system designed by the invention is divided into four modules.
A first part: data acquisition
In the data acquisition process, real sample data are transformed into a vector-valued data set to facilitate processing by the subsequent modules. In this step, the collected samples are divided into training samples and test samples, and the training samples are processed first. Each training sample is converted into a vector $x_i^c$, where $i$ indicates that the sample is the $i$-th of all training samples and $c$ indicates that it belongs to the $c$-th class. Each element of the vector corresponds to one attribute of the sample, and the dimension $d$ of the vector is the number of attributes. To simplify subsequent calculation, all training samples are combined into a training matrix $X_0$ in which each row is a sample; the subscript 0 denotes that $X_0$ is the initial input.
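As a rough illustration of this vectorization step, the following Python sketch turns raw records into the rows of a training matrix and expands discrete attributes into one-hot indicator columns, as the technical scheme describes. The function name `one_hot_encode`, the record layout (duration, protocol, byte count), and the sample values are assumptions made for this example and do not come from the patent.

```python
def one_hot_encode(rows, discrete_columns):
    """Expand each discrete column into 0/1 indicator columns; keep other columns as floats."""
    # Collect the sorted set of observed values for every discrete column.
    categories = {
        col: sorted({row[col] for row in rows}) for col in discrete_columns
    }
    matrix = []
    for row in rows:
        vector = []
        for col, value in enumerate(row):
            if col in categories:
                # One indicator per observed category of this column.
                vector.extend(1.0 if value == cat else 0.0 for cat in categories[col])
            else:
                vector.append(float(value))
        matrix.append(vector)
    return matrix, categories


# Hypothetical records: (duration, protocol, byte count).
raw = [(0, "tcp", 181), (2, "udp", 239), (0, "icmp", 235)]
X0, cats = one_hot_encode(raw, discrete_columns=[1])
```

Each row of `X0` then has one column per continuous attribute plus one column per observed category, so the dimension of the matrix grows with the number of distinct discrete values.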
A second part: training classification model
In this module, the training sample matrix $X_0$ generated by the previous module is passed to the core algorithm of the invention for training. The main steps are as follows:
1) The base classifiers used in the ensemble model are random forest and naive Bayes. In the random forest, a CART tree is used as the sub-classifier; each time a CART node is split, $k$ features are randomly selected from the $d$ features to take part in the Gini index comparison, where $k$ is generally set by a formula that is missing from this text. In its standard CART form, the Gini index is calculated as follows:
$$\mathrm{Gini\_index}(D, f_i) = \sum_{v} \frac{|D^v|}{|D|} \left(1 - \sum_{y} p_y^2\right)$$
where $f_i \in F_k$ denotes the $i$-th feature of the $k$-feature subspace $F_k$, $v$ denotes a value taken by feature $f_i$, $D^v$ is the subset of the sample set $D$ on which $f_i$ takes the value $v$, and $p_y$ is the proportion of class-$y$ samples in $D^v$. The lower the Gini index, the better the classification performance of the feature. Naive Bayes can be viewed as the simplest Bayesian network classifier. It relies on a conditional independence assumption, and its decision equation is
$$\hat{y} = \arg\max_{y} P(y) \prod_{i=1}^{d} P(x_i \mid y)$$
where $P(y)$ is the prior probability of class $y$ and $P(x_i \mid y)$ is the conditional probability of feature $i$ given class $y$. Neither random forest nor naive Bayes handles the imbalance problem well, because both are optimized globally over all samples.
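The two base-classifier criteria described above can be sketched in a few lines of Python. This is a minimal illustration, assuming discrete feature values and the standard CART definition of the Gini index; the function names `gini`, `gini_index`, and `naive_bayes_decide`, and all probabilities in the example, are hypothetical and not taken from the patent.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a label list: 1 - sum over classes of p_y squared."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_index(feature_values, labels):
    """Weighted Gini impurity after splitting on each distinct value of one feature."""
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    n = len(labels)
    return sum(len(group) / n * gini(group) for group in groups.values())


def naive_bayes_decide(x, priors, cond):
    """Pick the class maximizing P(y) * prod_i P(x_i | y).

    cond[y][i] maps a value of feature i to its conditional probability under class y.
    """
    scores = {}
    for y, prior in priors.items():
        score = prior
        for i, value in enumerate(x):
            score *= cond[y][i].get(value, 0.0)
        scores[y] = score
    return max(scores, key=scores.get)


# Hypothetical toy model with a single discrete feature (protocol).
priors = {"normal": 0.9, "attack": 0.1}
cond = {
    "normal": [{"tcp": 0.1, "udp": 0.9}],
    "attack": [{"tcp": 0.95, "udp": 0.05}],
}
decision = naive_bayes_decide(("tcp",), priors, cond)
```

A feature whose values separate the classes perfectly gets a Gini index of 0, while a feature that leaves each group evenly mixed between two classes gets 0.5.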
2) A random forest or naive Bayes base classifier is trained in each layer using a random down-sampling strategy at both the sample level and the feature level. Suppose the training set $X_F$ contains $N$ samples in total, of which $N_p$ belong to the minority class and $N_n$ to the majority class. In the double random down-sampling strategy, majority-class samples equal in number to the minority class, $N'_n = N_p$, are first drawn at random without replacement from the sample set, while all minority-class samples take part in training; then, for the feature space $F$, different feature subspaces $F'$ ($F' \subseteq F$) are selected for training. This not only reduces the effect of the imbalanced sample ratio but also filters out the negative influence of some undesirable features. The specific algorithm steps are as follows. For the $i$-th sample sampling ($i = 1, \dots, S$), the sample set $X_F^i = \mathrm{RUS}(X_F)$ is drawn. Then, for the $j$-th feature sampling ($j = 1, \dots, E$), a feature subspace $F'$ with $|F'| = |F| \times \delta$ is selected at random, and the base classifier $h_{i,j}(x)$ is trained on $X_F^i$ restricted to $F'$.
Here $S$ and $E$ are the numbers of ensemble rounds for sample sampling and feature sampling, respectively, $\delta$ is the feature sampling rate, and RUS denotes random under-sampling of the majority class to the size of the minority class.
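The double down-sampling loop can be sketched as follows. This is a minimal Python illustration that only produces the row and column index sets each base classifier would be trained on; it assumes a binary problem in which the majority class is at least as large as the minority class. The names `random_undersample` and `double_subspace_plan`, the seed, and the toy sizes are assumptions made for this example.

```python
import random


def random_undersample(labels, minority_label, rng):
    """Return row indices: every minority sample plus an equal number of majority samples."""
    minority = [i for i, y in enumerate(labels) if y == minority_label]
    majority = [i for i, y in enumerate(labels) if y != minority_label]
    # rng.sample draws without replacement, so N'_n equals N_p.
    return sorted(minority + rng.sample(majority, len(minority)))


def double_subspace_plan(labels, n_features, minority_label, S, E, delta, seed=0):
    """List (i, j, rows, cols) for S sample draws times E feature-subspace draws."""
    rng = random.Random(seed)
    n_keep = max(1, round(n_features * delta))  # size of each feature subspace
    plan = []
    for i in range(S):
        rows = random_undersample(labels, minority_label, rng)
        for j in range(E):
            cols = sorted(rng.sample(range(n_features), n_keep))
            plan.append((i, j, rows, cols))
    return plan


# Toy data: 3 minority samples (label 1) and 20 majority samples (label 0).
labels = [1] * 3 + [0] * 20
plan = double_subspace_plan(labels, n_features=10, minority_label=1, S=2, E=3, delta=0.5)
```

Each entry of `plan` corresponds to one base classifier, which would be fitted on the listed rows restricted to the listed feature columns, giving `S * E` classifiers per layer.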
3) The confidences output by the base classifiers of the previous layer are perturbed and concatenated with the original features as the input of the next layer. The base classifiers used by the cascade interpolation ensemble model are random forest and naive Bayes. The confidence of the random forest (RF) is calculated as
$$V_{RF}(i, y') = \frac{1}{T} \sum_{t=1}^{T} p_t(y' \mid x_i)$$
where $T$ is the number of trees and $p_t(y' \mid x_i)$ is the proportion of samples of category $y'$ in the leaf node of tree $t$ reached by sample $x_i$; it can be understood intuitively as the average proportion of samples of category $y'$ in the leaf nodes. The confidence of naive Bayes (NB) is calculated as
$$V_{NB}(i, y') = P(y' \mid x_i)$$
which is the posterior probability of category $y'$. To prevent overfitting, the base classifiers in each layer generate their confidences by 3-fold cross-validation. The resulting confidence vector $V$ then passes through the following confidence offset:
$$V'_l(i, y_{majority}) = V_l(i, y_{majority}) \times \eta$$
$$V'_l(i, y_{minority}) = V_l(i, y_{minority}) / \eta$$
where $\eta$ is a hyper-parameter whose value generally lies in a neighborhood of 1; the values used in the experiments are {0.85, 0.9, 0.95, 1, 1.05, 1.1, 1.15}. When $\eta = 1$ the confidence offset has no effect. As the equations show, the confidence of the majority class is multiplied by $\eta$ and the confidence of the minority class is divided by $\eta$. Perturbing the confidences in this way dynamically adjusts the bias weight between the majority and minority classes layer by layer. Finally, the perturbed confidences $V'_l$ are concatenated with the original features by interpolation and used as the input of the next layer:
$$X_l = [X_0, V'_l] \in \mathbb{R}^{m \times (d + N_{class})}$$
where $X_0$ is the original feature matrix, $l$ is the current layer number, $m$ is the number of samples, $d$ is the feature dimension, and $N_{class}$, the number of categories, is the dimension of the interpolated confidence.
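The confidence offset and the concatenation with the original features can be sketched in Python as below. This is a minimal illustration under stated assumptions: confidences are stored as one row per sample with one column per class, the majority class is identified by a column index, and every other column is treated as a minority class. The names `offset_confidence` and `next_layer_input` and the toy numbers are hypothetical.

```python
def offset_confidence(V, majority_col, eta):
    """Multiply majority-class confidences by eta and divide the others by eta."""
    return [
        [v * eta if col == majority_col else v / eta for col, v in enumerate(row)]
        for row in V
    ]


def next_layer_input(X0, V_offset):
    """Append the perturbed confidences to the original features of each sample."""
    return [features + confidences for features, confidences in zip(X0, V_offset)]


# Toy example: 2 samples, d = 3 original features, N_class = 2.
X0 = [[1.0, 0.0, 181.0], [0.0, 1.0, 239.0]]
V = [[0.8, 0.2], [0.3, 0.7]]  # columns: [majority, minority]
V_off = offset_confidence(V, majority_col=0, eta=0.9)
X1 = next_layer_input(X0, V_off)  # each row now has d + N_class = 5 entries
```

With `eta` below 1 the majority confidence shrinks and the minority confidence grows, so the next layer sees features biased toward the minority class; with `eta = 1` the confidences pass through unchanged.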
4) A validation mechanism is added to the cascade model so that the number of layers stops growing adaptively. It is implemented as follows. The number of layers of the cascade interpolation model is limited in two ways. First, in the experiments the maximum number of layers does not exceed 5. Second, after each layer has been trained, a validation pass is run through that layer together with all preceding layers. Because the preceding training was carried out with cross-validation, the validation result is more convincing. The mean accuracy (M-ACC) is used as the validation criterion:
$$\text{M-ACC} = \frac{TPR + TNR}{2}$$
where TPR is the accuracy on the minority class and TNR is the accuracy on the majority class. If the validated M-ACC drops, the number of layers stops growing.
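The validation criterion and the stopping rule can be sketched in Python as follows. This is a minimal illustration, assuming a binary problem in which both classes occur in the validation labels; `m_acc`, `grow_cascade`, and the callback `validate_layer` (which stands in for training one more layer and returning its validated M-ACC) are hypothetical names introduced for this example.

```python
def m_acc(y_true, y_pred, minority_label):
    """Mean of minority-class accuracy (TPR) and majority-class accuracy (TNR)."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == minority_label:
            if pred == minority_label:
                tp += 1
            else:
                fn += 1
        else:
            if pred == minority_label:
                fp += 1
            else:
                tn += 1
    tpr = tp / (tp + fn)  # assumes at least one minority sample
    tnr = tn / (tn + fp)  # assumes at least one majority sample
    return (tpr + tnr) / 2


def grow_cascade(validate_layer, max_layers=5):
    """Add layers until the validated M-ACC drops or max_layers is reached."""
    best, kept = None, 0
    for layer in range(1, max_layers + 1):
        score = validate_layer(layer)
        if best is not None and score < best:
            break  # the new layer made validation worse, so it is discarded
        best, kept = score, layer
    return kept, best


# Hypothetical validation scores for successive layers.
scores = [0.80, 0.85, 0.83, 0.90]
kept, best = grow_cascade(lambda layer: scores[layer - 1])
```

Because M-ACC averages the per-class accuracies, a classifier that labels everything as the majority class scores only 0.5, however imbalanced the data are.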
And a third part: testing unknown data
This module first takes the other half of the samples randomly divided in the first module as test samples to form a test sample matrix; the training set and the test set must follow the same probability distribution. Second, the model trained with the optimal hyper-parameter η and number of cascade layers $l$ is used in testing. Note that no confidence offset is applied during testing: it is precisely the difference in perturbation between the training set and the test set that makes the model sensitive to different classification thresholds, so that the imbalanced classification problem is handled better.
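The test-time pass through the cascade can be sketched in Python as below. In this minimal illustration each trained layer is represented by a callable that maps a feature matrix to per-sample class confidences; the confidences are appended to the original features without any η offset, and the last layer's output decides the class. The function `cascade_predict` and the two stub layers are assumptions made for this example, not trained models.

```python
def cascade_predict(X0, layers):
    """Run samples through the layers in order and return the argmax class of the last layer.

    layers: non-empty list of callables, each mapping a feature matrix to a list of
    per-sample confidence rows (one entry per class).
    """
    X, V = X0, None
    for predict_confidence in layers:
        V = predict_confidence(X)
        # Test time: concatenate raw confidences with the original features, no offset.
        X = [features + confidences for features, confidences in zip(X0, V)]
    return [max(range(len(row)), key=row.__getitem__) for row in V]


# Stub layers standing in for trained ensembles (purely illustrative).
def layer_one(X):
    return [[1 - x[0], x[0]] for x in X]


def layer_two(X):
    return [[x[-2], x[-1]] for x in X]


predictions = cascade_predict([[0.9], [0.2]], [layer_one, layer_two])
```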
Design of experiments
1) Selection and introduction of the experimental data sets: KDD stands for Knowledge Discovery and Data Mining, and the KDD Cup is an annual competition organized by SIGKDD (Special Interest Group on Knowledge Discovery and Data Mining) of the ACM (Association for Computing Machinery). The KDD Cup 99 data set is a standard benchmark in the field of network intrusion detection and laid a foundation for intrusion detection research based on computational intelligence. The different kinds of network attack data are clearly imbalanced in quantity, and this imbalance is a main factor affecting classification performance. The experiments use 5 imbalanced KDD Cup 99 data sets from the KEEL repository: 'land_vs_satan', 'guess_passwd_vs_satan', 'land_vs_portsweep', 'buffer_overflow_vs_back' and 'rootkit-imap_vs_back'. The data information is shown in the following table, and all discrete features in the data are replaced by one-hot encodings.
All data sets were evaluated with 5-fold cross-validation: each data set was shuffled and divided into 5 equal parts, and in each round 4 parts were used for training and 1 for testing, for 5 rounds in total, so that every sample is used for testing exactly once.
2) Comparison models: the system proposed by the invention is named CILDC, and its random-forest-based variant is named CILDC-RF. Random forest (RF), the double subspace SVM (ABRS-SVM), and the cost-sensitive SVM (CS-SVM) were chosen for comparison.
3) Parameter selection: the perturbation coefficient η of CILDC is chosen from {0.85, 0.9, 0.95, 1, 1.05, 1.1, 1.15}; the number of ensemble rounds in each of the two subspaces is 5; the random forest uses 50 trees; the SVM uses an RBF kernel, with the slack coefficient C and the kernel radius σ each chosen from {0.01, 0.1, 1, 10, 100}; and the feature sampling rate is chosen from {0.5, 0.7, 0.9}.
4) Performance measure: all experiments use the mean accuracy over the majority and minority classes, M-ACC, as the evaluation criterion.
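The shuffling and splitting described above can be sketched in Python as follows. This is a minimal illustration of a plain 5-fold split; whether the folds in the experiments were stratified by class is not stated, so no stratification is assumed here. The function name `k_fold_splits` and the seed are hypothetical.

```python
import random


def k_fold_splits(n_samples, k=5, seed=0):
    """Shuffle sample indices, cut them into k folds, and return (train, test) index pairs."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i, test in enumerate(folds):
        # Training indices are all folds except the i-th one.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


splits = k_fold_splits(10)
```

Across the 5 rounds, every index appears in a test fold exactly once and in a training set four times.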
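The candidate settings listed above can be enumerated as plain parameter grids, as in the Python sketch below. It assumes that the listed values are searched exhaustively, which the text does not state explicitly; the dictionary keys (`eta`, `delta`, `S`, `E`, `n_trees`, `C`, `sigma`, `kernel`) are illustrative names chosen for this example.

```python
from itertools import product

ETAS = [0.85, 0.9, 0.95, 1, 1.05, 1.1, 1.15]   # perturbation coefficient
FEATURE_RATES = [0.5, 0.7, 0.9]                 # feature sampling rate
SVM_VALUES = [0.01, 0.1, 1, 10, 100]            # candidates for both C and sigma

# CILDC candidates: every (eta, delta) pair with the fixed ensemble sizes.
cildc_grid = [
    {"eta": eta, "delta": delta, "S": 5, "E": 5, "n_trees": 50}
    for eta, delta in product(ETAS, FEATURE_RATES)
]

# SVM candidates: every (C, sigma) pair with an RBF kernel.
svm_grid = [
    {"C": C, "sigma": sigma, "kernel": "rbf"}
    for C, sigma in product(SVM_VALUES, SVM_VALUES)
]
```

Under this assumption there are 21 CILDC configurations and 25 SVM configurations to evaluate per data set.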
Results of the experiment
The M-ACC results of all models on each KEEL data set are as follows. The last row of the table gives the average M-ACC, and bold type marks the best result.
The table shows that the CILDC-RF of the invention obtains the best result on most data sets and outperforms the comparison algorithms. The advantage is particularly clear on the 'rootkit-imap_vs_back' data set. In addition, the variance of CILDC-RF is lower than that of the other algorithms, which indicates that its classification performance on KDD network attack data is more stable.
Claims (4)
1. A network intrusion detection method based on double subspace sampling and confidence offset, characterized in that the method comprises the following specific steps:
1) first preprocessing step: constructing network attack features with a network data acquisition tool, and converting the features of the collected sample set into a data matrix suitable for subsequent processing;
2) second preprocessing step: distinguishing continuous features from discrete features in the original data, and performing one-hot conversion on all discrete features;
3) first training step: a random forest or naive Bayes base classifier is trained in each layer using a random down-sampling strategy at both the sample level and the feature level, as follows: suppose the training set $X_F$ contains $N$ samples in total, of which $N_p$ belong to the minority class and $N_n$ to the majority class; sample sampling is performed $S$ times in total, and for the $i$-th sample sampling, in the double random down-sampling strategy, majority-class samples equal in number to the minority class, $N'_n = N_p$, are drawn at random without replacement from the sample set while all minority-class samples take part in training, giving the sample set $X_F^i = \mathrm{RUS}(X_F)$ in the feature space $F$ after the $i$-th sample sampling; feature sampling is then performed $E$ times, and for the $j$-th feature sampling a different feature subspace $F'$ is randomly selected from the feature space $F$, where $F' \subseteq F$ and $|F'| = |F| \times \delta$, and the base classifier $h_{i,j}(x)$ is trained on it; here $S$ and $E$ are the numbers of ensemble rounds for sample sampling and feature sampling, respectively, $\delta$ is the feature sampling rate, $X_F^i$ is the sample set in the feature space $F$ after the $i$-th sample sampling, $h_{i,j}(x)$ is the base classifier after the $i$-th sample sampling and the $j$-th feature sampling, and RUS denotes random under-sampling of the majority class to the size of the minority class;
4) second training step: the confidences output by the base classifiers of the previous layer are perturbed and concatenated with the original features as the input of the next layer;
5) third training step: a validation mechanism is added to the cascade model so that the number of layers stops growing adaptively;
6) testing step: the test data set is input into the obtained cascade model to obtain the final detection and classification result for network intrusion.
2. The network intrusion detection method based on double subspace sampling and confidence offset according to claim 1, characterized in that: in the second training step, the confidences output by the base classifiers of the previous layer are perturbed and concatenated with the original features as the input of the next layer, as follows: the base classifiers used by the cascade interpolation ensemble model are random forest and naive Bayes, where the confidence of the random forest (RF) is calculated as
$$V_{RF}(i, y') = \frac{1}{T} \sum_{t=1}^{T} p_t(y' \mid x_i)$$
where $T$ is the number of trees and $p_t(y' \mid x_i)$ is the proportion of samples of category $y'$ in the leaf node of tree $t$ reached by sample $x_i$, which can be understood intuitively as the average proportion of samples of category $y'$ in the leaf nodes, and the confidence of naive Bayes (NB) is calculated as
$$V_{NB}(i, y') = P(y' \mid x_i)$$
which is the posterior probability of category $y'$; the base classifiers in each layer generate their confidences by 3-fold cross-validation to prevent overfitting, and the resulting confidence vector $V$ passes through the following confidence offset:
$$V'_l(i, y_{majority}) = V_l(i, y_{majority}) \times \eta$$
$$V'_l(i, y_{minority}) = V_l(i, y_{minority}) / \eta$$
where $l$ is the current layer number, $V_l(i, y_{majority})$ is the probability that sample $i$ in layer $l$ belongs to the majority class, $V_l(i, y_{minority})$ is the probability that sample $i$ in layer $l$ belongs to the minority class, and $\eta$ is a hyper-parameter whose value generally lies in a neighborhood of 1; the confidence of the majority class is multiplied by $\eta$ and the confidence of the minority class is divided by $\eta$, so that the bias weight between the majority and minority classes is dynamically adjusted layer by layer through the perturbation of the confidences; finally, the perturbed confidences $V'_l$ are concatenated with the original features by interpolation and used as the input of the next layer:
$$X_l = [X_0, V'_l] \in \mathbb{R}^{m \times (d + N_{class})}$$
where $X_0$ is the original feature matrix, $\mathbb{R}$ is the set of real numbers, $l$ is the current layer number, $m$ is the number of samples, $d$ is the feature dimension, and $N_{class}$, the number of categories, is the dimension of the interpolated confidence.
3. The network intrusion detection method based on double subspace sampling and confidence offset according to claim 1, characterized in that: in the third training step, a validation mechanism is added to the cascade model so that the number of layers stops growing adaptively, implemented as follows: the number of layers of the cascade interpolation model is limited in two ways; first, in the experiments the maximum number of layers does not exceed 5; second, after each layer has been trained, a validation pass is run through that layer together with all preceding layers, with the mean accuracy (M-ACC) used as the validation criterion:
$$\text{M-ACC} = \frac{TPR + TNR}{2}$$
where TPR is the accuracy on the minority class and TNR is the accuracy on the majority class; if the validated M-ACC drops, the number of layers stops growing.
4. The network intrusion detection method based on double subspace sampling and confidence offset according to claim 1, characterized in that: in the testing step, the test data set is input into the obtained cascade model, and the features are not perturbed during layer-by-layer interpolation, specifically: the training set and the test set must follow the same probability distribution, and the model trained with the optimal hyper-parameter η and number of cascade layers $l$ is used in testing.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910490598.XA CN110177112B (en) | 2019-06-05 | 2019-06-05 | Network intrusion detection method based on double subspace sampling and confidence offset |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910490598.XA CN110177112B (en) | 2019-06-05 | 2019-06-05 | Network intrusion detection method based on double subspace sampling and confidence offset |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110177112A CN110177112A (en) | 2019-08-27 |
CN110177112B CN110177112B (en) | 2021-11-30 |
Family
ID=67697332
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910490598.XA Active CN110177112B (en) | 2019-06-05 | 2019-06-05 | Network intrusion detection method based on double subspace sampling and confidence offset |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110177112B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112016597B (en) * | 2020-08-12 | 2023-07-18 | 河海大学常州校区 | Depth sampling method based on Bayesian unbalance measurement in machine learning |
CN116226629B (en) * | 2022-11-01 | 2024-03-22 | 内蒙古卫数数据科技有限公司 | Multi-model feature selection method and system based on feature contribution |
CN117240602B (en) * | 2023-11-09 | 2024-01-19 | 北京中海通科技有限公司 | Identity authentication platform safety protection method |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108023876A (en) * | 2017-11-20 | 2018-05-11 | 西安电子科技大学 | Intrusion detection method and intruding detection system based on sustainability integrated study |
CN108304884A (en) * | 2018-02-23 | 2018-07-20 | 华东理工大学 | A kind of cost-sensitive stacking integrated study frame of feature based inverse mapping |
CN109347872A (en) * | 2018-11-29 | 2019-02-15 | 电子科技大学 | A kind of network inbreak detection method based on fuzziness and integrated study |
CN109460872A (en) * | 2018-11-14 | 2019-03-12 | 重庆邮电大学 | One kind being lost unbalanced data prediction technique towards mobile communication subscriber |
EP3336739B1 (en) * | 2016-12-18 | 2020-02-26 | Deutsche Telekom AG | A method for classifying attack sources in cyber-attack sensor systems |
-
2019
- 2019-06-05 CN CN201910490598.XA patent/CN110177112B/en active Active
Non-Patent Citations (1)
Title |
---|
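Research on imbalanced anomalous data classification algorithms based on active learning (基于主动学习的非均衡异常数据分类算法研究); Wang Bo, Wang Huaibin; 《信息网络安全》 (Netinfo Security); 2017-10-30; p. 46 * |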
Also Published As
Publication number | Publication date |
---|---|
CN110177112A (en) | 2019-08-27 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||