CN110336768B

CN110336768B - Situation prediction method based on combined hidden Markov model and genetic algorithm

Info

Publication number: CN110336768B
Application number: CN201910060212.1A
Authority: CN
Inventors: 高岭; 毛勇; 郑杰; 杨旭东; 冯通; 张晓�
Original assignee: Northwestern University
Current assignee: Northwestern University
Priority date: 2019-01-22
Filing date: 2019-01-22
Publication date: 2021-07-20
Anticipated expiration: 2039-01-22
Also published as: CN110336768A

Abstract

A situation prediction method based on a combined hidden Markov model and genetic algorithm. Redundant alarms and false alarms are processed with a fuzzy clustering method optimized by an artificial fish swarm algorithm; the fish swarm optimization overcomes the sensitivity of fuzzy c-means clustering to the initial cluster centers and thereby improves alarm clustering precision. In addition, because improperly set initial parameters of the hidden Markov model easily cause training to converge to a local optimum, the clustered alarms are used as input, the initial values of the hidden Markov model are optimized with a genetic algorithm, and the optimized parameters are further trained with the Baum-Welch algorithm to obtain the maximum likelihood estimate of the hidden Markov model parameters. The security situation is then predicted by combining the Viterbi algorithm with the observation values. The method can improve the accuracy of network security situation prediction.

Description

Situation prediction method based on combined hidden Markov model and genetic algorithm

Technical Field

The invention belongs to the technical field of information security, and particularly relates to a situation prediction method based on a combined hidden Markov model and a genetic algorithm.

Background

With the development of internet technology, more and more services are carried over the internet. Electric power, water conservancy, communications, banking, transportation, education, the military and other sectors all depend on the internet. The services carried on the internet and the information stored on it embody real physical and practical value, and the appearance of Bitcoin has further blurred the boundary between the virtual network world and the real world. The network world holds a huge and complex volume of information. Because the internet can be accessed freely, conveniently and quickly, people all over the world can use it without limits of time or place, and network security is receiving more and more attention. In recent years, attack tools and methods have become increasingly complex, and traditional security measures alone can no longer meet the requirements of security-sensitive departments. Traditional network protection measures are scattered and single-purpose, and cannot assess the key network factors comprehensively from a macroscopic view. It is against this background that research on network security situation awareness has emerged.

Network security situation awareness acquires, understands and evaluates key element data in a network and finally predicts the security situation of the whole network from the evaluation result; a specific network security situation awareness framework is shown in Fig. 2. Situation prediction continuously monitors the network state and, when the state is abnormal, predicts the next state of the network with a known prediction model. Existing situation prediction methods based on the hidden Markov model train the model with the EM algorithm on actual network observation values and, when the network is abnormal, use the trained model to predict the network situation value. They have the following defects:

Existing clustering methods are sensitive to the initial cluster centers when applied to intrusion detection alarm processing, so the analysis of the alarms is not accurate enough. This affects the training of the final model, and an accurate model cannot be obtained.

Owing to an inherent limitation of the hidden Markov model, there is no good criterion for choosing the initial value when the model is trained with the EM algorithm; a poorly chosen initial value then leads to a locally optimal training result.

Disclosure of Invention

In order to overcome the defects of the prior art, the invention aims to provide a situation prediction method based on a combined hidden Markov model and genetic algorithm. During alarm initialization the method applies a fuzzy clustering method optimized by the fish swarm algorithm, which effectively overcomes the tendency of alarm cluster analysis to fall into a local extremum and improves the precision of the alarm clustering result. At the same time it uses a swarm intelligence algorithm to optimize the hidden Markov situation prediction model, so that the model is trained well and local optima are avoided, thereby making the network security situation prediction result more accurate.

In order to achieve the purpose, the invention adopts the technical scheme that:

a situation prediction method based on a combined hidden Markov model and a genetic algorithm is characterized by comprising the following steps:

step one: preprocessing the collected intrusion detection alarms with an intrusion detection alarm clustering method based on artificial fish swarm optimized fuzzy c-means clustering, so that the alarms are reduced and accurately classified, and using the processed result as the external observation value of the network;

the preprocessing of the collected intrusion detection alarms with the intrusion detection alarm clustering method based on artificial fish swarm optimized fuzzy c-means clustering comprises the following steps:

1): initializing intrusion detection system alarms: removing unnecessary attributes and carrying out preliminary aggregation on multi-source heterogeneous data;

2) carrying out weight distribution on the alarm attribute by using a consistent matrix method;

3) establishing a fuzzy similarity matrix of the alarm by using a self-defined alarm attribute similarity function and a weight relation;

4) establishing a fuzzy equivalence matrix by using the transitive closure method, and establishing an artificial fish individual for each alarm;

5) constructing a food concentration function, and mapping the high-dimensional samples into three-dimensional space;

6) performing FCM clustering based on an artificial fish swarm algorithm, wherein the FCM clustering comprises the following steps:

1) defining an error function of the artificial fish swarm algorithm:

wherein $r'_{ij}$ represents the Euclidean distance between sample i and sample j after the high-dimensional samples are mapped into three-dimensional space; assuming that the coordinates of i and j are $(a_i, b_i, c_i)$ and $(a_j, b_j, c_j)$ respectively, then $r'_{ij} = \sqrt{(a_i - a_j)^2 + (b_i - b_j)^2 + (c_i - c_j)^2}$;

$r^*_{ij}$ is the value at the corresponding position in the fuzzy equivalence matrix established in step 4);

2) defining a food concentration function for an individual:

3) randomly distributing samples to be clustered, which are mapped from a high dimension to three dimensions, in a three-dimensional space, and randomly assigning a three-dimensional coordinate value to each sample;

4) calculating the food concentration of the artificial fish;

5) performing optimization behaviors such as swarming, foraging and following on the basis of the current food concentration of the fish swarm;

6) if all the artificial fishes in the group finish moving, continuing to execute downwards, otherwise, turning to the step 4);

7) if the difference between the updated individual maximum food concentration value of the artificial fish and the maximum food concentration function value before updating is smaller than a certain specified value, or the updating times reach the specified maximum times, ending, otherwise, turning to the step 4);

8) applying the FCM algorithm to the resulting three-dimensional coordinate values for clustering, and mapping the final result back to the original high-dimensional samples;

step two: determining the number N of hidden states of the network according to the network risk levels, dividing the initial probability of each hidden state into intervals according to expert experience, and likewise dividing into intervals the transition probabilities between hidden states and the output probabilities from hidden states to observable states;

step three: drawing random numbers within the intervals of the initial probability interval matrix, the transition probability interval matrix and the output probability interval matrix divided in step two, and normalizing them, so as to generate P sets of hidden Markov parameters, each consisting of an initial probability matrix π, a transition probability matrix A and an output probability matrix B;

the P initial probability matrices π, transition probability matrices A and output probability matrices B are each generated randomly, and the generated probability matrices satisfy the following normalization constraints: $\sum_{i=1}^{N} \pi_i = 1$, and, for $i = 1, 2, \ldots, N$, $\sum_{j=1}^{N} a_{ij} = 1$ and $\sum_{k=1}^{m} b_i(k) = 1$, where m is the number of observable states;

step four: encoding the generated P sets of initial parameter matrices with a floating-point encoding method; each chromosome generated by the floating-point encoding corresponds to the three parameter matrices of the hidden Markov model and comprises three parts: the hidden state initial probability matrix corresponds to the initial chromosome Geπ, the hidden state transition probability matrix corresponds to the transition chromosome GeA, and the output matrix from hidden states to observable states corresponds to the output chromosome GeB;

step five: calculating the fitness values of all P chromosomes and, in order to prevent the randomness of the genetic algorithm from destroying the individual with the best fitness value in the current population, copying the individual with the maximum fitness value directly into the next population, i.e. an elitist preservation strategy;

step six: for the remaining P-1 chromosomes, calculating the weighted sum of each chromosome's fitness value and its support degree for the population dispersion, and applying the roulette rule so that the population size reaches P again;

the individual support degree for the dispersion of the population is calculated on the basis of the following definitions:

definition 1: the size of the population is S, one chromosome contains $Q = m \times N + N \times N + N$ genes, and chromosome k is denoted $G_k = (G_{k1}, G_{k2}, \ldots, G_{kQ})$, $k = 1, 2, \ldots, S$;

definition 2: chromosome fitness function f: since the optimal chromosome solved by the genetic algorithm is the initial parameter matrix of the HMM, the forward probability of each chromosome is used as the fitness function, i.e.

$f = P(O \mid \lambda)$;

definition 3: the individual phenotype $\eta_k$ is the ratio of the fitness value of chromosome k to the sum of the population fitness values, i.e. $\eta_k = f_k / \sum_{i=1}^{S} f_i$

Definition 4: defining population dispersion d

Definition 5: defining the support degree of the kth chromosome on the dispersion of the population as follows;

step seven: determining the crossover probability according to the support degree, and completing genetic crossover between individuals by arithmetic crossover, according to the following steps:

1): randomly selecting a chromosome k, and calculating the formula:

wherein $Spt_{max}$ represents the maximum support degree, $Spt_{min}$ represents the minimum support degree, and $Spt_k$ represents the support degree of the randomly selected chromosome;

2): generating a random number r; if r < Sptr, chromosome k is taken as a chromosome to be crossed; repeating these two steps until two chromosomes to be crossed have been selected;

3): performing genetic crossover on the two chromosomes to be crossed according to the following rule: Geπ1 crosses with Geπ2, GeA1 crosses with GeA2, and GeB1 crosses with GeB2;

step eight: determining the mutation probability according to the support degree, and completing individual genetic mutation by non-uniform mutation; the mutation is performed as follows:

wherein $G_k$ is the randomly selected chromosome k before mutation, $G_k'$ is the chromosome obtained by mutating $G_k$, and $G_{max}$ and $G_{min}$ are the individuals with the maximum and minimum current fitness respectively; t is a mutation constant in the interval (0, 1], and r is a random number; if the random integer rand() is even, the mutation $G_k' = G_k + t\,(G_{max} - G_k)\,r$ is used, and if it is odd, the mutation $G_k' = G_k + t\,(G_k - G_{min})\,r$ is used; the mutation probability is determined from the support degree in the same way as the crossover probability in step seven;

step nine: normalizing each individual of the new population produced by genetic replication, crossover and mutation so that it satisfies the hidden Markov parameter constraints;

step ten: checking whether the preset iteration termination condition is met; if so, terminating, selecting the chromosome with the maximum fitness value as the global optimum, and mapping it to the three initial matrices of the hidden Markov model; otherwise, returning to step five for a new round of evolution;

step eleven: iteratively training the model parameters λ = (π, A, B) obtained in step ten with the Baum-Welch algorithm to obtain the maximum likelihood estimate of the hidden Markov model parameters, comprising the following steps:

1) obtaining D alarm sequence data samples $\{O_1, O_2, \ldots, O_D\}$ with the intrusion detection alarm clustering method based on artificial fish swarm optimized fuzzy c-means clustering, where any alarm sequence $O_d = \{o_1^{(d)}, o_2^{(d)}, o_3^{(d)}, \ldots, o_T^{(d)}\}$;

2) obtaining the optimal initial value λ = (π, A, B) through the genetic algorithm optimization;

3) for each sample d = 1, 2, ..., D, calculating $\gamma_t^{(d)}(i)$ and $\xi_t^{(d)}(i, j)$, t = 1, 2, ..., T, with the forward-backward algorithm;

4) updating the model parameter matrices;

5) checking whether each matrix meets the convergence condition; if so, ending the algorithm; otherwise, returning to step 3) for another iteration;

step twelve: if the network state is abnormal, predicting the network security situation with the Viterbi algorithm from the collected external observation values and the trained hidden Markov model.

The invention has the following advantages:

1. The collected alarm data are classified by combining the artificial fish swarm algorithm with fuzzy clustering, which effectively overcomes the low clustering accuracy caused by the sensitivity of traditional clustering methods to the initial cluster centers when processing redundant alarms, and thereby improves situation prediction accuracy.

2. A genetic algorithm and a hidden Markov model are combined for situation prediction: the optimized initial values produced by the genetic algorithm are fed into the Baum-Welch algorithm, which is trained iteratively with the detected and processed network alarm data as observation values to obtain the parameter values. This effectively overcomes the local optimum that improper initial value selection causes in traditional hidden Markov model situation prediction.

Drawings

Fig. 1 is a working principle diagram of the present invention.

Fig. 2 is a network security situation awareness framework diagram.

FIG. 3 is a flow chart of the fish swarm algorithm-fuzzy clustering alarm processing steps of the present invention.

FIG. 4 is a diagram of the genetic algorithm optimization process of the present invention.

Detailed Description

The present invention will be further described with reference to the following examples and drawings, but the present invention is not limited to the following examples.

The invention provides a situation prediction method based on a combined hidden Markov model and genetic algorithm. Hidden Markov situation prediction in existing network security situation awareness methods has a theoretical weakness that easily leads to a locally optimal training result; to address it, the invention optimizes the initial parameters with swarm intelligence so that the Baum-Welch algorithm starts training from parameter values of higher fitness. During initialization of the training data, an artificial fish swarm algorithm combined with c-means clustering is used to remove false alarms and redundant alarms. Used together, the two methods can improve the accuracy of the situation prediction result, so that a network security administrator obtains a more accurate picture of the real network security situation.

Fig. 1 is a schematic diagram of the operation of the present invention. Specifically, the alarm data of the intrusion detection system are taken as input; after initialization, the data are processed with the improved clustering method and used as situation observation values. After the initial parameters of the existing hidden Markov prediction model are optimized, the model is trained with the Baum-Welch algorithm on the situation observation values, finally yielding the maximum likelihood model parameter values for the observation sequence. The situation value of the network is then predicted from the observation sequence with the Viterbi algorithm. The method specifically comprises the following steps:

1) preprocessing the collected intrusion detection alarms with the intrusion detection alarm clustering method based on artificial fish swarm optimized fuzzy clustering, so that the alarms are reduced and accurately classified, and using the processed result as the external observation value of the network;

2) determining the number N of hidden states of the network according to the network risk levels, dividing the initial probability of each hidden state into intervals according to expert experience, and likewise dividing into intervals the transition probabilities between hidden states and the output probabilities from hidden states to observable states;

3) drawing random numbers within the intervals of the initial probability interval matrix, the transition probability interval matrix and the output probability interval matrix divided in step 2), and normalizing them, so as to generate P sets of hidden Markov parameters, each consisting of an initial probability matrix π, a transition probability matrix A and an output probability matrix B;

4) encoding the generated P initial probability matrixes by adopting a floating point number encoding method;

5) calculating the fitness values of all P chromosomes and, in order to prevent the randomness of the genetic algorithm from destroying the individual with the best fitness value in the current population, copying the individual with the maximum fitness value directly into the next population, i.e. an elitist preservation strategy;

6) for the remaining P-1 chromosomes, calculating the weighted sum of each chromosome's fitness value and its support degree for the population dispersion, and applying the roulette rule so that the population size reaches P again;

7) determining the crossover probability according to the support degree, and completing genetic crossover among individuals by adopting an arithmetic crossover mode;

8) determining the mutation probability according to the support degree, and completing individual genetic mutation by non-uniform mutation;

9) normalizing each individual of the new population produced by genetic replication, crossover and mutation so that it satisfies the hidden Markov parameter constraints;

10) checking whether the preset iteration termination condition is met; if so, terminating, selecting the chromosome with the maximum fitness value as the global optimum, and mapping it to the three initial matrices of the hidden Markov model; otherwise, returning to step 5) to begin a new round of evolution;

11) iteratively training the model parameters λ = (π, A, B) obtained in step 10) with the Baum-Welch algorithm to obtain the maximum likelihood estimate of the hidden Markov model parameters;

12) if the network state is abnormal, predicting the network security situation with the Viterbi algorithm from the collected external observation values and the trained hidden Markov model.

Fig. 3 shows the fish swarm algorithm and fuzzy clustering alarm processing steps. Specifically, the collected intrusion detection alarms are preprocessed with the intrusion detection alarm clustering method based on artificial fish swarm optimized fuzzy clustering, as follows:

(1): initializing intrusion detection system alarms: removing unnecessary attributes and preliminarily aggregating the multi-source heterogeneous data, comprising the following steps:

1) inputting a piece of alarm information $x_i$; if i = 1, recording its alarm type as type(1) and setting the type counter t = 1;

2) when i ≥ 2, comparing the type of the current alarm, type($x_i$), with the types type(1) to type(t) identified so far, and handling the alarm according to the result of the comparison;

3) when i = n, dividing the alarm data of each of the t types according to a predefined time length;

(2) distributing weights to the alarm attributes with the consistent matrix method, specifically comprising the following steps:

1) according to expert experience, scoring the pairwise importance ratios of the m attributes of the intrusion detection alarm to obtain a judgment matrix

wherein $x_{ij}$ is the ratio of the importance of the i-th attribute to that of the j-th attribute;

2)

the weights of the factors are $(\beta_1, \beta_2, \ldots, \beta_i, \ldots, \beta_n)$;

(3) establishing a fuzzy similarity matrix of the alarms from user-defined alarm attribute similarity functions and the attribute weights, the attribute similarity functions being:

1) time similarity function:

2) port similarity function:

3) source/destination IP address similarity function:

(η is the number of identical leading bits, counted from left to right, of the two source/destination IP addresses);

4) protocol similarity function;

the similarity $x_{ij}$ of the i-th alarm and the j-th alarm is calculated as follows:

(where m is the number of attributes, and the per-attribute term is the similarity value of the individual attributes of the i-th and j-th alarms);

(4) establishing a fuzzy equivalence matrix by using the transitive closure method, and establishing an artificial fish individual for each alarm;

(5) constructing a food concentration function, and mapping the high-dimensional samples into three-dimensional space;

(6) performing FCM clustering based on an artificial fish swarm algorithm, wherein the FCM clustering comprises the following steps:

1) defining an error function of the artificial fish swarm algorithm:

wherein $r'_{ij}$ represents the Euclidean distance between sample i and sample j after the high-dimensional samples are mapped into three-dimensional space; assuming that the coordinates of i and j are $(a_i, b_i, c_i)$ and $(a_j, b_j, c_j)$ respectively, then $r'_{ij} = \sqrt{(a_i - a_j)^2 + (b_i - b_j)^2 + (c_i - c_j)^2}$;

$r^*_{ij}$ is the value at the corresponding position in the fuzzy equivalence matrix established in step (4);

2) defining a food concentration function for an individual:

3) randomly distributing the samples to be clustered, mapped from high dimension to three dimensions, in three-dimensional space, and randomly assigning a three-dimensional coordinate value to each sample;

4) calculating the food concentration of each artificial fish;

5) performing optimization behaviors such as swarming, foraging and following on the basis of the current food concentration of the fish swarm;

6) if all the artificial fish in the swarm have finished moving, continuing downwards; otherwise, going to step 4);

7) if the difference between the updated maximum individual food concentration value and the maximum food concentration value before the update is smaller than a specified value, or the number of updates reaches the specified maximum, ending; otherwise, going to step 4);

8) applying the FCM algorithm to the resulting three-dimensional coordinate values for clustering, and mapping the final result back to the original high-dimensional samples.
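The final FCM stage of step 8) can be illustrated with a minimal sketch. It assumes the fish swarm search has already assigned each alarm a three-dimensional coordinate, and it takes the starting cluster centers as an argument, since sensitivity to those centers is what the preceding optimization is meant to reduce. The names `fcm` and `fuzzifier`, the default fuzzifier of 2, the stopping tolerance and the sample points are illustrative assumptions, not values given in the patent.

```python
import math


def _dist(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def _memberships(points, centers, fuzzifier):
    """u[i][c] = 1 / sum_k (d_ic / d_ik)^(2 / (fuzzifier - 1))."""
    power = 2.0 / (fuzzifier - 1.0)
    u = []
    for p in points:
        # A tiny floor avoids division by zero when a point sits on a center.
        d = [max(_dist(p, c), 1e-12) for c in centers]
        u.append([1.0 / sum((dc / dk) ** power for dk in d) for dc in d])
    return u


def fcm(points, centers, fuzzifier=2.0, max_iter=100, tol=1e-6):
    """Fuzzy c-means from given starting centers (hypothetical helper).

    Returns (centers, u), where u[i][c] is the membership of point i in cluster c.
    """
    centers = [list(c) for c in centers]
    dim = len(points[0])
    for _ in range(max_iter):
        u = _memberships(points, centers, fuzzifier)
        new_centers = []
        for c in range(len(centers)):
            # Each center is the mean of all points weighted by u_ic^fuzzifier.
            w = [row[c] ** fuzzifier for row in u]
            total = sum(w)
            new_centers.append(
                [sum(wi * p[k] for wi, p in zip(w, points)) / total for k in range(dim)]
            )
        shift = max(_dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, _memberships(points, centers, fuzzifier)


# Two well-separated groups of 3-D points (made-up illustrative data).
points = [(0, 0, 0), (0.5, 0, 0), (0, 0.5, 0), (10, 10, 10), (10.5, 10, 10), (10, 10.5, 10)]
centers, u = fcm(points, centers=[(1, 1, 1), (9, 9, 9)])
```

Each row of `u` sums to 1, so an alarm can be assigned to the cluster with the largest membership and then mapped back to its original high-dimensional record.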

FIG. 4 is a diagram of the genetic algorithm optimization process. Specifically, P hidden Markov initial probability matrices π, transition probability matrices A and output probability matrices B are each generated randomly. The generated probability matrices satisfy the following normalization constraints: $\sum_{i=1}^{N} \pi_i = 1$, and, for $i = 1, 2, \ldots, N$, $\sum_{j=1}^{N} a_{ij} = 1$ and $\sum_{k=1}^{m} b_i(k) = 1$, where m is the number of observable states.

Each chromosome generated by the floating-point encoding method corresponds to the three parameter matrices of a hidden Markov model and comprises three parts: the hidden state initial probability matrix corresponds to the initial chromosome Geπ, the hidden state transition probability matrix corresponds to the transition chromosome GeA, and the output matrix from hidden states to observable states corresponds to the output chromosome GeB, as shown in figure 1:
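The random generation, normalization and floating-point encoding described above can be sketched as follows. This is only an illustration under stated assumptions: the interval values are made up, the helper names (`sample_distribution`, `random_hmm`, `encode`, `decode`) are hypothetical, and rescaling a row to sum to 1 may move a value slightly outside its original interval, a case the patent text does not address.

```python
import random


def sample_distribution(intervals, rng):
    """Draw one value from each (low, high) interval, then rescale to sum to 1."""
    raw = [rng.uniform(low, high) for low, high in intervals]
    total = sum(raw)
    return [x / total for x in raw]


def random_hmm(pi_intervals, a_intervals, b_intervals, rng):
    """Generate one candidate (pi, A, B) from expert-supplied interval matrices."""
    pi = sample_distribution(pi_intervals, rng)
    A = [sample_distribution(row, rng) for row in a_intervals]
    B = [sample_distribution(row, rng) for row in b_intervals]
    return pi, A, B


def encode(pi, A, B):
    """Concatenate Ge_pi, Ge_A and Ge_B (row by row) into one list of floats."""
    return list(pi) + [x for row in A for x in row] + [x for row in B for x in row]


def decode(chromosome, n, m):
    """Inverse of encode for n hidden states and m observable states."""
    pi = chromosome[:n]
    A = [chromosome[n + i * n: n + (i + 1) * n] for i in range(n)]
    offset = n + n * n
    B = [chromosome[offset + i * m: offset + (i + 1) * m] for i in range(n)]
    return pi, A, B


# Illustrative intervals for N = 2 hidden states and m = 3 observable states.
rng = random.Random(0)
pi_iv = [(0.5, 0.9), (0.1, 0.5)]
a_iv = [[(0.6, 0.9), (0.1, 0.4)], [(0.2, 0.5), (0.5, 0.8)]]
b_iv = [[(0.5, 0.8), (0.1, 0.3), (0.05, 0.2)], [(0.05, 0.2), (0.2, 0.4), (0.4, 0.7)]]
population = [encode(*random_hmm(pi_iv, a_iv, b_iv, rng)) for _ in range(5)]
```

Each chromosome here has m × N + N × N + N = 12 genes, matching the gene count Q given in Definition 1 of the disclosure.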

The individual support degree for the dispersion of the population is calculated on the basis of the following definitions:

$f = P(O \mid \lambda)$

Definition 3: the individual phenotype $\eta_k$ is the ratio of the fitness value of chromosome k to the sum of the population fitness values, i.e. $\eta_k = f_k / \sum_{i=1}^{S} f_i$

Definition 4: defining the dispersion d of the population;

Definition 5: defining the support degree of the k-th chromosome for the dispersion of the population as
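The fitness f = P(O|λ) and the phenotype of Definition 3 can be sketched with the standard forward algorithm. The function names and the toy parameters are illustrative assumptions; observations are taken to be integer symbol indices, and no scaling is applied, so very long sequences would underflow in this simple form. The dispersion and support formulas are not reproduced in the text above and are therefore not sketched.

```python
def forward_probability(obs, pi, A, B):
    """Return P(O | lambda) for one observation sequence via the forward algorithm."""
    n = len(pi)
    # Initialization: alpha_1(i) = pi_i * b_i(o_1)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    # Induction: alpha_{t+1}(j) = (sum_i alpha_t(i) * a_ij) * b_j(o_{t+1})
    for o in obs[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o] for j in range(n)]
    # Termination: sum over the final forward variables
    return sum(alpha)


def phenotypes(fitness):
    """eta_k = f_k / sum of all fitness values in the population."""
    total = sum(fitness)
    return [f / total for f in fitness]


# Toy model with 2 hidden states and 2 observable symbols (made-up numbers).
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.5, 0.5], [0.1, 0.9]]
fitness = forward_probability([0, 1], pi, A, B)
```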

The roulette rule is applied so that the population size reaches P again, with the following specific steps:

(1): calculating $t_i = u f_i + v\,Spt_i$, where u and v are the weights of the fitness value and the support value respectively;

(2): calculating $T_n = \sum_i (u f_i + v\,Spt_i)$;

(3): calculating $W_i = t_i / T_n$;

(4): calculating the cumulative probability $g_i = \sum_{j=1}^{i} W_j$;

(5): generating a random number r uniformly distributed between 0 and 1 and comparing r with $g_i$; if $g_{i-1} < r < g_i$, selecting individual i to enter the next-generation population; repeating (4) and (5) until the number of individuals in the new population equals the parent population size;
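Steps (1) to (5) amount to roulette-wheel selection over a weighted score, which might be sketched as below. The support values are assumed to be already computed (their formula is not reproduced in the text), the function name and sample numbers are hypothetical, and the boundary case r = g_i is resolved with `bisect_right`, a choice the patent does not specify.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def roulette_select(fitness, support, u, v, count, rng):
    """Select `count` individual indices with probability W_i = t_i / T_n."""
    t = [u * f + v * s for f, s in zip(fitness, support)]  # step (1)
    total = sum(t)                                         # step (2)
    w = [x / total for x in t]                             # step (3)
    g = list(accumulate(w))                                # step (4)
    g[-1] = 1.0  # guard against floating-point round-off in the last bin
    chosen = []
    for _ in range(count):                                 # step (5)
        r = rng.random()  # uniform in [0, 1)
        # First index i with g[i] > r, i.e. g[i-1] <= r < g[i]
        chosen.append(bisect_right(g, r))
    return chosen


rng = random.Random(7)
picked = roulette_select(
    fitness=[0.2, 0.3, 0.5], support=[0.5, 0.3, 0.2], u=0.5, v=0.5, count=10, rng=rng
)
```

Weighting the support value alongside fitness lets individuals that contribute to population diversity survive selection even when their fitness is modest.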

The crossover probability is determined according to the support degree, and genetic crossover between individuals is completed by arithmetic crossover:

(1): randomly selecting a chromosome k, and calculating the formula:

wherein $Spt_{max}$ represents the maximum support degree, $Spt_{min}$ represents the minimum support degree, and $Spt_k$ represents the support degree of the randomly selected chromosome;

(2): generating a random number r; if r < Sptr, chromosome k is taken as a chromosome to be crossed; repeating these two steps until two chromosomes to be crossed have been selected;

(3): performing genetic crossover on the two chromosomes to be crossed according to the following rule: Geπ1 crosses with Geπ2, GeA1 with GeA2, and GeB1 with GeB2. The specific crossover operation is as follows:

1) parent generation: $Ge\pi 1 = \{\pi_{11}, \pi_{12}, \ldots, \pi_{1n}\}$, $Ge\pi 2 = \{\pi_{21}, \pi_{22}, \ldots, \pi_{2n}\}$;

2) randomly selecting a gene position k;

3) offspring: $Ge\pi 1' = \{\pi_{11}, \pi_{12}, \ldots, a\pi_{1k} + (1-a)\pi_{2k}, a\pi_{1(k+1)} + (1-a)\pi_{2(k+1)}, \ldots, a\pi_{1n} + (1-a)\pi_{2n}\}$, where a is a random number between 0 and 1; the crossover of the transition matrix and of the output matrix is performed in the same way and is not described again;
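A minimal sketch of the arithmetic crossover on one chromosome segment follows. The text gives only the first offspring; the symmetric second offspring, the function name and the sample parents are assumptions added for illustration.

```python
import random


def arithmetic_crossover(parent1, parent2, rng):
    """Blend the genes from a random position k onward; earlier genes are copied."""
    k = rng.randrange(len(parent1))  # crossover position
    a = rng.random()                 # blend weight between 0 and 1
    tail1 = [a * x + (1 - a) * y for x, y in zip(parent1[k:], parent2[k:])]
    # Assumed symmetric counterpart for the second offspring.
    tail2 = [a * y + (1 - a) * x for x, y in zip(parent1[k:], parent2[k:])]
    return parent1[:k] + tail1, parent2[:k] + tail2


rng = random.Random(1)
ge_pi_1 = [0.2, 0.3, 0.5]
ge_pi_2 = [0.6, 0.1, 0.3]
child1, child2 = arithmetic_crossover(ge_pi_1, ge_pi_2, rng)
```

Because only the tail is blended, an offspring generally no longer sums to 1, which is why the individuals are normalized again after crossover and mutation.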

The mutation probability is determined according to the support degree, and genetic mutation of individuals is completed by non-uniform mutation. Specifically, the following formula is used:

the mutation is performed as follows:

wherein $G_k$ is the randomly selected chromosome k before mutation, $G_k'$ is the chromosome obtained by mutating $G_k$, and $G_{max}$ and $G_{min}$ are the individuals with the maximum and minimum current fitness respectively. t is a mutation constant in the interval (0, 1], and r is a random number. That is, when the random integer rand() is even, the mutation $G_k' = G_k + t\,(G_{max} - G_k)\,r$ is used, and when it is odd, the mutation $G_k' = G_k + t\,(G_k - G_{min})\,r$ is used.
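The two mutation branches can be sketched as below. The sketch assumes r is drawn once per mutation and applied to every gene, and it stands in for rand() with a random integer from Python's `random` module; both are assumptions, as are the function name and the sample chromosomes. Mutated genes may fall outside [0, 1] or break the sum-to-one constraint, so the subsequent normalization step is still needed.

```python
import random


def nonuniform_mutation(g_k, g_max, g_min, t, rng):
    """Mutate chromosome g_k gene by gene using one of the two branches.

    t is the mutation constant in (0, 1]; r is drawn once per mutation.
    """
    r = rng.random()
    if rng.randrange(1 << 30) % 2 == 0:
        # Even draw: G_k' = G_k + t * (G_max - G_k) * r
        return [x + t * (hi - x) * r for x, hi in zip(g_k, g_max)]
    # Odd draw: G_k' = G_k + t * (G_k - G_min) * r
    return [x + t * (x - lo) * r for x, lo in zip(g_k, g_min)]


rng = random.Random(42)
g_k = [0.3, 0.7, 0.6, 0.4]
g_max = [0.5, 0.5, 0.8, 0.2]
g_min = [0.1, 0.9, 0.4, 0.6]
mutated = nonuniform_mutation(g_k, g_max, g_min, t=0.5, rng=rng)
```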

The obtained λ = (π, A, B) is trained iteratively with the Baum-Welch algorithm to obtain the maximum likelihood estimate of the HMM parameters, with the following specific steps:

(1): obtaining D alarm sequence data samples $\{O_1, O_2, \ldots, O_D\}$ with the intrusion detection alarm clustering method based on artificial fish swarm optimized fuzzy c-means clustering, where any alarm sequence $O_d = \{o_1^{(d)}, o_2^{(d)}, \ldots, o_T^{(d)}\}$;

(2): obtaining the optimal initial value λ = (π, A, B) through the genetic algorithm optimization;

(3): for each sample d = 1, 2, ..., D, calculating $\gamma_t^{(d)}(i)$ and $\xi_t^{(d)}(i, j)$, t = 1, 2, ..., T, with the forward-backward algorithm, where

$\alpha_t(i)$ is the forward probability, $\beta_t(i)$ is the backward probability, $a_{ij}$ is the transition probability, and $b_j(o_{t+1})$ is the output probability;

(4): the model parameters are updated according to the following formula:

(5): checking whether each matrix meets the convergence condition; if so, ending the algorithm; otherwise, returning to step (3) for another iteration.
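For reference, the standard textbook form of the Baum-Welch quantities for D observation sequences is given below. The formula images of the original publication are not reproduced in this text, so these expressions are an assumed reconstruction from the symbols named in step (3), with $v_k$ (an added symbol) denoting the k-th observable value.

```latex
\[
\gamma_t^{(d)}(i) = \frac{\alpha_t^{(d)}(i)\,\beta_t^{(d)}(i)}
                         {\sum_{j=1}^{N} \alpha_t^{(d)}(j)\,\beta_t^{(d)}(j)},
\qquad
\xi_t^{(d)}(i,j) = \frac{\alpha_t^{(d)}(i)\,a_{ij}\,b_j\!\left(o_{t+1}^{(d)}\right)\beta_{t+1}^{(d)}(j)}
                        {\sum_{r=1}^{N}\sum_{s=1}^{N} \alpha_t^{(d)}(r)\,a_{rs}\,b_s\!\left(o_{t+1}^{(d)}\right)\beta_{t+1}^{(d)}(s)}
\]
\[
\pi_i = \frac{1}{D}\sum_{d=1}^{D} \gamma_1^{(d)}(i),
\qquad
a_{ij} = \frac{\sum_{d=1}^{D}\sum_{t=1}^{T-1} \xi_t^{(d)}(i,j)}
              {\sum_{d=1}^{D}\sum_{t=1}^{T-1} \gamma_t^{(d)}(i)},
\qquad
b_j(k) = \frac{\sum_{d=1}^{D}\sum_{t:\,o_t^{(d)} = v_k} \gamma_t^{(d)}(j)}
              {\sum_{d=1}^{D}\sum_{t=1}^{T} \gamma_t^{(d)}(j)}
\]
```

In this form $\xi_t^{(d)}(i,j)$ is defined only for t = 1, ..., T-1, since it refers to the observation at time t+1.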

The trained hidden Markov model parameters are obtained through the above steps. When the network is not operating normally, the prediction algorithm proceeds as follows:

(1) acquiring the network situation observation value sequence;

(2) acquiring the trained hidden Markov model parameters;

(3) computing the most probable hidden state sequence with the Viterbi algorithm;

(4) determining the network security situation value at the next moment according to the state transition matrix.
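The four prediction steps can be sketched with a plain Viterbi decoder. Reading step (4) as "take the most probable successor of the last decoded state" is an interpretation, not something the text spells out; the function names and the toy model are likewise illustrative assumptions.

```python
def viterbi(obs, pi, A, B):
    """Return the most probable hidden state sequence for the observations."""
    n = len(pi)
    delta = [pi[i] * B[i][obs[0]] for i in range(n)]
    backpointers = []
    for o in obs[1:]:
        # For each target state j, remember the best predecessor state.
        prev = [max(range(n), key=lambda i: delta[i] * A[i][j]) for j in range(n)]
        delta = [delta[prev[j]] * A[prev[j]][j] * B[j][o] for j in range(n)]
        backpointers.append(prev)
    # Backtrack from the most probable final state.
    state = max(range(n), key=lambda i: delta[i])
    path = [state]
    for prev in reversed(backpointers):
        state = prev[state]
        path.append(state)
    path.reverse()
    return path


def predict_next_state(path, A):
    """Assumed reading of step (4): most probable successor of the last state."""
    last = path[-1]
    return max(range(len(A)), key=lambda j: A[last][j])


# Toy model: 2 hidden states, 3 observable symbols (made-up numbers).
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
path = viterbi([0, 1, 2], pi, A, B)
next_state = predict_next_state(path, A)
```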

Claims

1. A situation prediction method based on a combined hidden Markov model and a genetic algorithm is characterized by comprising the following steps:

1) defining an error function of the artificial fish swarm algorithm:

wherein $r'_{ij}$ represents the Euclidean distance between sample i and sample j after the high-dimensional samples are mapped into three-dimensional space; assuming that the coordinates of i and j are $(a_i, b_i, c_i)$ and $(a_j, b_j, c_j)$ respectively, then $r'_{ij} = \sqrt{(a_i - a_j)^2 + (b_i - b_j)^2 + (c_i - c_j)^2}$;

$r^*_{ij}$ is the value at the corresponding position in the fuzzy equivalence matrix established in step 4);

2) defining a food concentration function for an individual:

4) calculating the food concentration of the artificial fish;

step two: determining the number N of hidden states of the network according to the network risk levels, dividing the initial probability of each hidden state into intervals, and likewise dividing into intervals the transition probabilities between hidden states and the output probabilities from hidden states to observable states;

i＝1,2,3....N……；

step six: for the remaining P-1 chromosomes, calculating the weighted sum of each chromosome's fitness value and its support degree for the population dispersion, and applying the roulette rule so that the population size reaches P again, with the following specific steps:

(1): calculating $t_i = u f_i + v\,Spt_i$, where u and v are the weights of the fitness value and the support value respectively;

(2): calculating $T_n = \sum_i (u f_i + v\,Spt_i)$;

(3): calculating $W_i = t_i / T_n$;

(4): calculating cumulative probability

(5): generating a random number r uniformly distributed between 0 and 1 and comparing r with $g_i$; if $g_{i-1} < r < g_i$, selecting individual i to enter the next-generation population; repeating (4) and (5) until the number of individuals in the new population equals the parent population size;

definition 1: the size of the population is S, one chromosome contains $Q = m \times N + N \times N + N$ genes, and chromosome k is denoted $G_k = (G_{k1}, G_{k2}, \ldots, G_{kQ})$, $k = 1, 2, \ldots, S$;

$f = P(O \mid \lambda)$;

definition 3: defining the individual phenotype $\eta_k$, i.e. the ratio of the fitness value of chromosome k to the sum of the population fitness values

k＝1，2，...S；

Definition 4: defining population dispersion d

1): randomly selecting a chromosome k, and calculating the formula:

2): generating a random number r; if r < Sptr, chromosome k is taken as a chromosome to be crossed; repeating these two steps until two chromosomes to be crossed have been selected;

wherein $G_k$ is the randomly selected chromosome k before mutation, $G_k'$ is the chromosome obtained by mutating $G_k$, $G_{max}$ and $G_{min}$ are the individuals with the maximum and minimum current fitness respectively, t is a mutation constant in the interval (0, 1], and r is a random number; if the random integer rand() is even, the mutation $G_k' = G_k + t\,(G_{max} - G_k)\,r$ is used, and if it is odd, the mutation $G_k' = G_k + t\,(G_k - G_{min})\,r$ is used;

step ten: checking whether a preset iteration termination condition is met, if so, terminating, selecting the chromosome with the maximum fitness value as a global optimum value, mapping the chromosome to three initial matrixes of the hidden Markov model, and otherwise, returning to the fifth step to carry out a new round of evolution;

(1): obtaining D alarm sequence data samples $\{O_1, O_2, \ldots, O_D\}$ with the intrusion detection alarm clustering method based on artificial fish swarm optimized fuzzy c-means clustering, where any alarm sequence $O_d = \{o_1^{(d)}, o_2^{(d)}, \ldots, o_T^{(d)}\}$;

wherein $\alpha_t(i)$ is the forward probability, $\beta_t(i)$ is the backward probability, $a_{ij}$ is the transition probability, and $b_j(o_{t+1})$ is the output probability;

(4): the model parameters are updated according to the following formula:

(5): checking whether each matrix meets a convergence condition, if so, finishing the algorithm, and otherwise, returning to the step (3) for iterative execution;