CN113542271A - Network background traffic generation method based on a generative adversarial network (GAN)

Info

Publication number
CN113542271A
Authority
CN
China
Prior art keywords
network
data packet
equal
traffic
flow data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110796467.1A
Other languages
Chinese (zh)
Other versions
CN113542271B (en)
Inventor
董庆宽
任晓龙
陈原
赵晓倩
杨福兴
穆涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Original Assignee
Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xidian University
Priority to CN202110796467.1A
Publication of CN113542271A
Application granted
Publication of CN113542271B
Legal status: Active
Anticipated expiration

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 - Network architectures or network communication protocols for network security
    • H04L63/14 - Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 - Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 - Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/2132 - Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on discrimination criteria, e.g. discriminant analysis
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 - Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 - Network analysis or design
    • H04L41/145 - Network analysis or design involving simulating, designing, planning or modelling of a network
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 - Network architectures or network communication protocols for network security
    • H04L63/14 - Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 - Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416 - Event detection, e.g. attack signature detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Evolutionary Computation (AREA)
  • Signal Processing (AREA)
  • General Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Computer Hardware Design (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention provides a network background traffic generation method based on a generative adversarial network (GAN), comprising the following steps: 1) acquiring a training sample set; 2) constructing a generative adversarial network model library; 3) iteratively training the generative adversarial network model library; 4) acquiring the traffic data packet features predicted by the trained generator networks; 5) obtaining the network traffic generation result. Using a training sample set that contains network traffic data packet features of multiple network applications, the invention iteratively trains a model library consisting of multiple generative adversarial networks, one for each type of network application. This accelerates the convergence of the model library and effectively improves the efficiency of network background traffic generation while ensuring communication security.

Description

Network background traffic generation method based on a generative adversarial network (GAN)
Technical Field
The invention belongs to the technical field of network security and relates to a network background traffic generation method, in particular to a network background traffic generation method based on a generative adversarial network (GAN), which can be used to generate network background traffic.
Background
When communication nodes on the Internet communicate through a network application, they need to exchange traffic data packets. One network traffic flow sent by a communication node contains a sequence of traffic data packets, the i-th element of which denotes the i-th traffic data packet that the communication node needs to send.
Operators that provide network application services need a large number of network traffic data packet samples for network security analysis, network stress testing, and similar tasks, so network traffic generation technology has been developing continuously. Network traffic generation methods fall mainly into two groups: methods based on statistical models and methods based on traffic features.
Methods based on statistical models generate traffic mainly by combining probability models, such as Markov models and Poisson distribution models, with traffic generation tools. They are used mainly to generate background network traffic during Internet stress testing.
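As a rough illustration of the statistical-model approach, the sketch below draws packet send times from a Poisson process by summing exponentially distributed gaps. The function name, the rate, and the duration are hypothetical values chosen for the example and do not come from the patent.

```python
import random

def poisson_arrival_times(rate_per_s, duration_s, seed=0):
    """Sample packet send times of a Poisson process on [0, duration_s)."""
    rng = random.Random(seed)
    times = []
    # Exponentially distributed gaps between packets give Poisson arrivals.
    t = rng.expovariate(rate_per_s)
    while t < duration_s:
        times.append(t)
        t += rng.expovariate(rate_per_s)
    return times

# Hypothetical parameters: about 100 packets per second for one second.
arrivals = poisson_arrival_times(rate_per_s=100.0, duration_s=1.0)
```

A traffic generation tool would then emit one packet at each sampled time.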
Methods based on traffic data packet features mainly use machine learning to extract features from traffic data packets as the training sample set of a neural network. The neural network is then built and trained iteratively, and finally outputs predicted network traffic features. A traffic generation tool generates an initial data packet sequence from the predicted features, and the data that the user needs to send is encrypted and embedded into the initial data packet sequence to produce the network traffic.
A generative adversarial network can predict traffic data packet features such that the probability distribution of the features predicted by the generator network is statistically very similar to that of the training sample set, so applying generative adversarial networks to network traffic generation is of great significance. For example, the patent application with publication number CN109889452A, entitled "Network background traffic generation method and system based on conditional generative adversarial network", discloses a background traffic generation method based on a conditional generative adversarial network (CGAN). That method pads every pre-collected traffic data packet sample to a fixed M-dimensional vector, builds a conditional generative adversarial network, trains it iteratively, generates simulated background traffic with the generator network of the trained conditional generative adversarial network, and then transmits the traffic. Its drawback is that each traffic data packet sample in the training sample set is padded to a fixed 1518 features after vectorization, and traffic data packets of all the different types are used together as the training sample set of a single conditional generative adversarial network. Because that network is iteratively trained on many different types of traffic data packets, convergence is slow and the efficiency of generating network background traffic is low.
Disclosure of Invention
The invention aims to overcome the above defects of the prior art by providing a network background traffic generation method based on a generative adversarial network (GAN), with the goal of improving the efficiency of network background traffic generation while ensuring communication security.
To achieve this purpose, the technical solution adopted by the invention comprises the following steps:
(1) Obtaining a training sample set $X_{train}$:
(1a) Using the wireshark tool, capture S traffic data packets $B = \{B_1, B_2, \dots, B_s, \dots, B_S\}$ covering M network applications while a communication node communicates over the Internet, where each network application corresponds to at least one traffic data packet, each traffic data packet corresponds to one network application, and each traffic data packet $B_s$ contains W features; label each network application category to obtain the category label set $R_{class} = \{R_1, R_2, \dots, R_m, \dots, R_M\}$ corresponding to the M network applications, where $M \ge 2$, $S \ge 5000$, $B_s$ denotes the s-th traffic data packet, $1 \le s \le S$, $W \ge 2$, $R_m$ denotes the category label corresponding to the m-th network application, and $1 \le m \le M$;
(1b) One-hot encode the non-numerical features of each traffic data packet $B_s$, and normalize each one-hot-encoded traffic data packet to obtain the set of preprocessed traffic data packets, each element of which is the preprocessing result of the corresponding $B_s$;
(1c) Label each preprocessed traffic data packet with the network application category labels in $R_{class}$ to obtain the corresponding network application category label set $y = \{y_1, y_2, \dots, y_s, \dots, y_S\}$, and combine the set of preprocessed traffic data packets with the label set y into the training sample set $X_{train}$, where $y_s$ denotes the network application category label of the s-th preprocessed traffic data packet, $X_{R_m}$ denotes the set of samples in $X_{train}$ whose network application category label is $R_m$, the v-th element of $X_{R_m}$ denotes the v-th sample whose network application category label is $R_m$, V denotes the number of samples in $X_{train}$ whose network application category label is $R_m$, $0 < V < S$, and $0 \le v \le V$;
(2) Constructing a generative adversarial network model library:
Construct a model library comprising M generative adversarial networks, one for each type of network application. Each generative adversarial network comprises a generator network and a discriminator network cascaded in sequence, where the generator network comprises an input layer, a first fully connected module, and an output layer, and the discriminator network comprises an input layer, a second fully connected module, and an output layer; the m-th generative adversarial network in the library is the one corresponding to the m-th network application;
(3) Iteratively training the generative adversarial network model library:
(3a) Initialize the parameters of the generator network and the parameters of the discriminator network contained in the m-th generative adversarial network; let the iteration counter be $q_1$ and the maximum number of iterations be $Q_1$, with $Q_1 \ge 2000$, and set $q_1 = 0$;
(3b) From the sample set $X_{R_m}$ whose network application category label is $R_m$, randomly select K samples as the input of the m-th generative adversarial network. The generator network predicts the preprocessed traffic data packet features for each selected sample, yielding a set of K predicted traffic data packet features. The discriminator network then computes, for each selected sample and for each predicted feature vector, the probability that it comes from the sample set $X_{R_m}$, yielding a probability set for the selected samples and a probability set $D_2 = \{d_1, d_2, \dots, d_k, \dots, d_K\}$ for the predicted features, where $1 \le K \le 50$, $1 \le k \le K$, the k-th element of the first probability set is the probability computed by the discriminator network that the k-th randomly selected sample comes from $X_{R_m}$, and $d_k$ is the probability computed by the discriminator network that the traffic data packet features predicted by the generator network from the k-th sample come from $X_{R_m}$;
(3c) Using a cross-entropy loss function, compute the loss of the generator network from the probabilities output by the discriminator network and, from the probability computed for each selected sample together with $d_k$, the loss of the discriminator network. Using back propagation, compute the gradient of the generator network parameters from the generator loss and the gradient of the discriminator network parameters from the discriminator loss. Then, using a gradient descent algorithm, update the generator network parameters with their gradient and the discriminator network parameters with their gradient;
(3d) Determine whether $q_1 = Q_1$ holds; if so, the M trained generative adversarial networks are obtained; otherwise, let $q_1 = q_1 + 1$ and return to step (3b);
(4) Obtaining the traffic data packet features predicted by the trained generator networks:
Use the training sample set $X_{train}$ as the input of the trained generator networks. The trained generator network of each application category predicts the preprocessed traffic data packet features for every sample in the sample set $X_{R_m}$ whose label is $R_m$, yielding the predicted traffic data packet feature set A. A contains one predicted feature set per label $R_m$, obtained by prediction from the V samples of $X_{R_m}$; its v-th element is the traffic data packet feature predicted by the trained generator network from the v-th sample of $X_{R_m}$;
(5) Obtaining the network traffic generation result:
From the predicted traffic data packet feature set A, randomly select the predicted feature set whose application category label is $R_w$, and from it randomly select L predicted traffic data packet features. A traffic generator generates an initial traffic data packet sequence $c = \{c_1, c_2, \dots, c_l, \dots, c_L\}$ from the selected features. The data to be sent by the communication node is encrypted and embedded into each initial traffic data packet $c_l$, yielding network traffic $c' = \{c'_1, c'_2, \dots, c'_l, \dots, c'_L\}$ consisting of L traffic data packets with embedded encrypted data, where $c_l$ denotes the initial traffic data packet generated by the traffic generator from the l-th selected predicted feature, $c'_l$ denotes the l-th initial traffic data packet after the encrypted data has been embedded, $1 \le l \le L$, $L \le V$, and $R_1 \le R_w \le R_M$.
Compared with the prior art, the invention has the following advantages:
1. Using a training sample set that contains network traffic data packet features of multiple network applications, the invention iteratively trains a model library consisting of multiple generative adversarial networks, one per type of network application, so that each generative adversarial network corresponds to exactly one network application. This avoids the slow convergence that the prior art suffers from when a single conditional generative adversarial network is iteratively trained on many different types of traffic data packets, and it effectively improves the efficiency of network background traffic generation while ensuring communication security.
2. In the generative adversarial network model library of the invention, each generative adversarial network comprises a generator network and a discriminator network cascaded in sequence. This simple structure speeds up training convergence and further improves the efficiency of network background traffic generation.
Drawings
FIG. 1 is a flow chart of an implementation of the present invention.
Detailed Description
The invention is described in further detail below with reference to the figures and specific examples.
Referring to fig. 1, the present invention includes the steps of:
Step 1) Obtaining a training sample set $X_{train}$:
Step 1a) In this embodiment, the wireshark tool is used to capture S traffic data packets $B = \{B_1, B_2, \dots, B_s, \dots, B_S\}$ covering M network applications while a communication node communicates over the Internet, where each network application corresponds to at least one traffic data packet, each traffic data packet corresponds to one network application, and each traffic data packet $B_s$ contains W features. Each network application category is labeled to obtain the category label set $R_{class} = \{R_1, R_2, \dots, R_m, \dots, R_M\}$ corresponding to the M network applications, where $B_s$ denotes the s-th traffic data packet, $R_m$ denotes the category label corresponding to the m-th network application, and $1 \le m \le M$. In this embodiment, S = 7553, M = 5, and W = 8. The 5 network applications are HTTP web page requests, WeChat, OneNote, 163 Mail, and Youdao Dictionary. Each traffic data packet includes 8 features, listed as the source port, the destination port, the network service type of the target host, the protocol type, the packet length, the packet arrival time interval, and the sliding window length.
Step 1b) One-hot encode the non-numerical features of each traffic data packet $B_s$, and normalize each one-hot-encoded traffic data packet to obtain the set of preprocessed traffic data packets, each element of which is the preprocessing result of the corresponding $B_s$. In this embodiment, the two features that are one-hot encoded are the protocol type and the network service type of the target host.
One-hot encoding uses an N-bit state register to encode N states, with exactly one bit set for each state. It converts non-numerical features, which are difficult for the generative adversarial network model to learn, into numerical features that are easy to learn, thereby reducing the difficulty of training the model.
Data normalization simplifies computation and mitigates the gradient explosion problem that arises when the generative adversarial network adjusts its parameters by gradient descent, thereby accelerating the convergence of the generative adversarial network model.
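The preprocessing described in step 1b can be sketched as follows. This is a minimal illustration, assuming min-max scaling to [0, 1] as the normalization (the patent does not state which normalization is used); the function names and the sample values are hypothetical.

```python
def one_hot(values):
    """Map each categorical value to a 0/1 vector containing a single 1."""
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    return [[1.0 if index[v] == i else 0.0 for i in range(len(categories))]
            for v in values]

def min_max_normalize(column):
    """Scale a numeric column to [0, 1] (assumed normalization scheme)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]

# Hypothetical values for two of the features of three packets.
protocols = ["TCP", "UDP", "TCP"]   # non-numerical feature
lengths = [60, 1500, 780]           # numerical feature

encoded = one_hot(protocols)
scaled = min_max_normalize(lengths)
# One preprocessed row per packet: one-hot columns followed by scaled length.
rows = [enc + [s] for enc, s in zip(encoded, scaled)]
```

Each row in `rows` plays the role of one preprocessed traffic data packet.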
Step 1c) Label each preprocessed traffic data packet with the network application category labels in $R_{class}$ to obtain the corresponding network application category label set $y = \{y_1, y_2, \dots, y_s, \dots, y_S\}$, and combine the set of preprocessed traffic data packets with the label set y into the training sample set $X_{train}$, where $y_s$ denotes the network application category label of the s-th preprocessed traffic data packet, $X_{R_m}$ denotes the set of samples in $X_{train}$ whose network application category label is $R_m$, the v-th element of $X_{R_m}$ denotes the v-th sample whose network application category label is $R_m$, V denotes the number of samples in $X_{train}$ whose network application category label is $R_m$, $0 < V < S$, and $0 \le v \le V$. In this embodiment, $X_{R_1}$ contains 3067 samples, $X_{R_2}$ contains 2368 samples, $X_{R_3}$ contains 903 samples, $X_{R_4}$ contains 453 samples, and $X_{R_5}$ contains 762 samples.
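Splitting the training sample set into one subset per application label can be sketched as below. The helper name and the toy packets are hypothetical; the per-class counts are the ones stated for this embodiment and are only used to check that they add up to S = 7553.

```python
from collections import defaultdict

def split_by_label(packets, labels):
    """Group preprocessed packets into one sample set per application label."""
    groups = defaultdict(list)
    for features, label in zip(packets, labels):
        groups[label].append(features)
    return dict(groups)

# Toy data: three preprocessed packets and their application labels.
packets = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
labels = ["R1", "R2", "R1"]
subsets = split_by_label(packets, labels)

# Sample counts per class reported in the embodiment.
embodiment_counts = {"R1": 3067, "R2": 2368, "R3": 903, "R4": 453, "R5": 762}
total_samples = sum(embodiment_counts.values())
```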
Step 2) Constructing a generative adversarial network model library:
Construct a model library comprising M generative adversarial networks, one for each type of network application. Each generative adversarial network comprises a generator network and a discriminator network cascaded in sequence, where the generator network comprises an input layer, a first fully connected module, and an output layer, and the discriminator network comprises an input layer, a second fully connected module, and an output layer; the m-th generative adversarial network in the library is the one corresponding to the m-th network application.
In the generator network, the first fully connected module comprises three fully connected layers stacked in sequence, all with the leaky-ReLU activation function and with 50, 30, and 30 neurons respectively; the output layer contains 8 neurons, and its activation function is tanh.
In the discriminator network, the second fully connected module comprises fully connected layers that all use the leaky-ReLU activation function, with 100, 60, and 30 neurons respectively; the output layer contains 5 neurons, and its activation function is softmax.
The leaky-ReLU activation function is used to address the problem of the network gradient vanishing during back propagation. A common activation function in neural network learning is f(x) = x, whose derivative is constantly 1, which results in the disappearance of the gradient during back propagation. Using the leaky-ReLU activation function in the generative adversarial network speeds up its learning and thus shortens its training time.
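The layer sizes above can be illustrated with a small forward pass in plain Python. This is only a sketch: the input width of 8, the leaky-ReLU slope of 0.01, and the random weight initialization are assumptions not stated in the patent, and all function names are hypothetical.

```python
import math
import random

def leaky_relu(x, slope=0.01):
    # slope=0.01 is an assumed value for the negative side.
    return x if x > 0 else slope * x

def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W x + b)."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def init_layers(sizes, rng):
    """Random weights for consecutive layer sizes, e.g. [8, 50, 30, 30, 8]."""
    return [([[rng.gauss(0, 0.1) for _ in range(n_in)] for _ in range(n_out)],
             [0.0] * n_out)
            for n_in, n_out in zip(sizes, sizes[1:])]

def forward(x, layers, final):
    # Hidden layers use leaky-ReLU; the last layer uses tanh or softmax.
    for weights, biases in layers[:-1]:
        x = dense(x, weights, biases, leaky_relu)
    weights, biases = layers[-1]
    if final == "tanh":
        return dense(x, weights, biases, math.tanh)
    return softmax(dense(x, weights, biases, lambda z: z))

rng = random.Random(0)
generator = init_layers([8, 50, 30, 30, 8], rng)       # 50-30-30, 8 outputs
discriminator = init_layers([8, 100, 60, 30, 5], rng)  # 100-60-30, 5 outputs
sample = [rng.random() for _ in range(8)]
fake = forward(sample, generator, "tanh")
scores = forward(fake, discriminator, "softmax")
```

The untrained networks only demonstrate shapes and activations: `fake` has 8 values in [-1, 1], and `scores` is a 5-way probability vector.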
Step 3) Iteratively training the generative adversarial network model library:
Step 3a) Initialize the parameters of the generator network and the parameters of the discriminator network contained in the m-th generative adversarial network; let the iteration counter be $q_1$ and the maximum number of iterations be $Q_1$, with $Q_1 = 12000$ in this embodiment, and set $q_1 = 0$;
Step 3b) From the sample set $X_{R_m}$ whose network application category label is $R_m$, randomly select K samples as the input of the m-th generative adversarial network. The generator network predicts the preprocessed traffic data packet features for each selected sample, yielding a set of K predicted traffic data packet features. The discriminator network then computes, for each selected sample and for each predicted feature vector, the probability that it comes from the sample set $X_{R_m}$, yielding a probability set for the selected samples and a probability set $D_2 = \{d_1, d_2, \dots, d_k, \dots, d_K\}$ for the predicted features, where K = 10 in this embodiment, $1 \le k \le K$, the k-th element of the first probability set is the probability computed by the discriminator network that the k-th randomly selected sample comes from $X_{R_m}$, and $d_k$ is the probability computed by the discriminator network that the traffic data packet features predicted by the generator network from the k-th sample come from $X_{R_m}$;
Step 3c) Using a cross-entropy loss function, compute the loss of the generator network from the probabilities output by the discriminator network and, from the probability computed for each selected sample together with $d_k$, the loss of the discriminator network. Using the back propagation method built into the Adam optimizer, compute the gradient of the generator network parameters from the generator loss and the gradient of the discriminator network parameters from the discriminator loss. The Adam optimizer then uses a gradient descent algorithm to update the generator network parameters with their gradient and the discriminator network parameters with their gradient.
The generator loss and the discriminator loss are calculated by the formulas shown in Figure BDA0003162952940000819 and Figure BDA0003162952940000820, respectively.
Step 3d) Determine whether $q_1 = Q_1$ holds; if so, the M trained generative adversarial networks are obtained; otherwise, let $q_1 = q_1 + 1$ and return to step (3b).
The invention builds a model library consisting of multiple generative adversarial networks, one per type of network application, and iteratively trains the generative adversarial network models in the library in parallel on a training sample set that contains network traffic data packet features of multiple network applications. This overcomes the slow convergence that the prior art suffers from when a single conditional generative adversarial network is iteratively trained on many different types of traffic data packets.
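The control flow of steps 3a to 3d can be sketched as a loop over the model library. This is a simplified sketch: it visits the models one after another rather than in parallel, and `train_step` is a hypothetical callback standing in for one generator and discriminator update; none of the names come from the patent.

```python
import random

def train_model_library(subsets, train_step, max_iters, batch_size, seed=0):
    """Run the per-application training loop for every model in the library.

    train_step(label, batch) is a hypothetical callback that performs one
    generator/discriminator update and returns a loss value.
    """
    rng = random.Random(seed)
    history = {}
    for label, samples in subsets.items():
        losses = []
        q1 = 0                                       # step 3a: reset counter
        while True:
            batch = rng.sample(samples, batch_size)  # step 3b: draw K samples
            losses.append(train_step(label, batch))  # step 3c: one update
            if q1 == max_iters:                      # step 3d: stop at q1 = Q1
                break
            q1 += 1
        history[label] = losses
    return history

def dummy_step(label, batch):
    # Stand-in for a real update; returns the batch size as a fake "loss".
    return float(len(batch))

subsets = {"R1": [[0.1], [0.2], [0.3]], "R2": [[0.4], [0.5]]}
history = train_model_library(subsets, dummy_step, max_iters=3, batch_size=2)
```

Because the counter starts at 0 and the check comes after the update, each model receives `max_iters + 1` updates in this reading of the steps. Since every model only touches its own subset, the outer loop could be distributed across workers.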
Step 4) Obtaining the traffic data packet features predicted by the trained generator networks:
Use the training sample set $X_{train}$ as the input of the trained generator networks. The trained generator network of each application category predicts the preprocessed traffic data packet features for every sample in the sample set $X_{R_m}$ whose label is $R_m$, yielding the predicted traffic data packet feature set A. A contains one predicted feature set per label $R_m$, obtained by prediction from the V samples of $X_{R_m}$; its v-th element is the traffic data packet feature predicted by the trained generator network from the v-th sample of $X_{R_m}$.
Step 5) Obtain the network traffic generation result:
From the predicted traffic data packet feature set A, randomly select the predicted traffic data packet feature set with application category label R_w,
Figure BDA0003162952940000093
and from
Figure BDA0003162952940000094
randomly select L predicted traffic data packet features
Figure BDA0003162952940000095
Using a traffic generator script such as tarfen, write a configuration file according to the predicted traffic data packet feature set
Figure BDA0003162952940000096
; the traffic generator then generates an initial traffic data packet sequence c = {c_1, c_2, ..., c_l, ..., c_L} according to the configuration file. The data to be sent by the communication node is encrypted and then embedded in turn into each initial traffic data packet c_l, yielding network traffic c' = {c'_1, c'_2, ..., c'_l, ..., c'_L} comprising L traffic data packets with embedded encrypted data, where
Figure BDA0003162952940000097
is the l-th predicted traffic data packet feature randomly selected from the predicted traffic data packet feature set
Figure BDA0003162952940000098
, c_l denotes the initial traffic data packet generated by the traffic generator according to the l-th predicted traffic data packet feature, c'_l denotes the traffic data packet obtained by embedding encrypted data into the l-th initial traffic data packet, 1 ≤ l ≤ L, L ≥ V, and R_1 ≤ R_w ≤ R_M. In this embodiment, L = 10.
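The selection and embedding logic of step 5) can be sketched roughly as below. Everything here is an assumption made for illustration: `select_features` and `embed` are hypothetical helpers, packets are modeled as plain dictionaries instead of being produced by a traffic generator from a configuration file, selection is done with replacement, and the ciphertext is taken as already encrypted bytes.

```python
import random


def select_features(predicted, L, rng=random):
    """Pick one category at random, then L predicted feature vectors from it.

    predicted: {category label: [feature vector, ...]}.
    Selection is with replacement, which is an assumption of this sketch.
    """
    label = rng.choice(sorted(predicted))
    return label, rng.choices(predicted[label], k=L)


def embed(ciphertext, features):
    """Split the ciphertext into len(features) chunks and attach them to the packets in order."""
    count = len(features)
    size = -(-len(ciphertext) // count)  # ceiling division
    return [
        {"features": f, "payload": ciphertext[i * size:(i + 1) * size]}
        for i, f in enumerate(features)
    ]
```

Concatenating the `payload` fields in packet order restores the original ciphertext, so a receiver that knows the packet order can recover and decrypt the data.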
The foregoing description is only one example of the present invention and does not limit it in any way. It will be apparent to those skilled in the art that, once the content and principle of the present invention are understood, various modifications and changes in form and detail may be made without departing from that principle; such modifications and changes nevertheless remain within the scope of the claims of the present invention.

Claims (3)

1. A network background traffic generation method based on a generative adversarial network (GAN), characterized by comprising the following steps:
(1) obtaining a training sample set X_train:
(1a) using the wireshark tool, obtaining S traffic data packets B = {B_1, B_2, ..., B_s, ..., B_S} covering M network applications while a communication node communicates with the Internet, wherein each network application corresponds to at least one traffic data packet, each traffic data packet corresponds to one network application, and each traffic data packet B_s includes W features; and labeling each network application category to obtain the category label set R_class = {R_1, R_2, ..., R_m, ..., R_M} corresponding to the M network applications, where M ≥ 2, S ≥ 5000, B_s denotes the s-th traffic data packet, 1 ≤ s ≤ S, W ≥ 2, R_m denotes the category label corresponding to the m-th network application, and 1 ≤ m ≤ M;
(1b) performing one-hot encoding on the non-numerical features of each traffic data packet B_s, and normalizing each one-hot-encoded traffic data packet to obtain the preprocessed traffic data packet set
Figure FDA0003162952930000011
where
Figure FDA0003162952930000012
denotes the preprocessing result of B_s;
(1c) labeling each preprocessed traffic data packet
Figure FDA0003162952930000013
with the network application category labels in R_class to obtain, for
Figure FDA0003162952930000014
, the corresponding network application category label set y = {y_1, y_2, ..., y_s, ..., y_S}, and combining the preprocessed traffic data packet set
Figure FDA0003162952930000015
and its corresponding network application category label set y into the training sample set
Figure FDA0003162952930000016
where y_s denotes, for the preprocessed traffic data packet
Figure FDA0003162952930000017
, the corresponding network application category label,
Figure FDA0003162952930000018
denotes the set of samples in X_train whose network application category label is R_m,
Figure FDA0003162952930000019
Figure FDA00031629529300000110
denotes the v-th sample whose network application category label is R_m, V denotes the number of samples in X_train whose network application category label is R_m, 0 < V < S, and 0 ≤ v ≤ V;
(2) constructing a generative adversarial network model library:
constructing a model library comprising M generative adversarial networks, equal in number to the network application categories,
Figure FDA00031629529300000111
each generative adversarial network
Figure FDA00031629529300000112
comprising, cascaded in sequence, a generator network
Figure FDA0003162952930000021
and a discriminator network
Figure FDA0003162952930000022
wherein the generator network
Figure FDA0003162952930000023
comprises an input layer, a first fully connected module and an output layer; the discriminator network
Figure FDA0003162952930000024
comprises an input layer, a second fully connected module and an output layer, and
Figure FDA0003162952930000025
denotes the generative adversarial network corresponding to the m-th network application;
(3) iteratively training the generative adversarial network model library:
(3a) initializing, for the m-th generative adversarial network
Figure FDA0003162952930000026
, the parameters of its generator network
Figure FDA0003162952930000027
as
Figure FDA0003162952930000028
, the parameters of its discriminator network
Figure FDA0003162952930000029
as
Figure FDA00031629529300000210
, the iteration counter as q_1 and the maximum number of iterations as Q_1, with Q_1 ≥ 2000, and letting q_1 = 0;
(3b) from the sample set with network application category label R_m,
Figure FDA00031629529300000211
randomly selecting K samples
Figure FDA00031629529300000212
as the input of the generative adversarial network
Figure FDA00031629529300000213
; the generator network
Figure FDA00031629529300000214
performs, on each sample
Figure FDA00031629529300000215
, preprocessed traffic data packet feature prediction to obtain the set of predicted traffic data packet features
Figure FDA00031629529300000216
; the discriminator network
Figure FDA00031629529300000217
separately calculates, for each
Figure FDA00031629529300000218
and each
Figure FDA00031629529300000219
, the probability that it is drawn from the sample set
Figure FDA00031629529300000220
, to obtain the probability set
Figure FDA00031629529300000221
and the probability set D_2 = {d_1, d_2, ..., d_k, ..., d_K}, where 1 ≤ K ≤ 50, 1 ≤ k ≤ K,
Figure FDA00031629529300000222
denotes the k-th randomly selected sample,
Figure FDA00031629529300000223
denotes the result of passing
Figure FDA00031629529300000224
through the generator network
Figure FDA00031629529300000225
, namely the predicted traffic data packet feature,
Figure FDA00031629529300000226
denotes the probability, calculated by the discriminator network
Figure FDA00031629529300000227
, that
Figure FDA00031629529300000228
is drawn from the sample set
Figure FDA00031629529300000229
, and d_k denotes the probability, calculated by the discriminator network
Figure FDA00031629529300000230
, that
Figure FDA00031629529300000231
is drawn from the sample set
Figure FDA00031629529300000232
;
(3c) using a cross-entropy loss function, calculating from
Figure FDA00031629529300000233
the loss of the generator network
Figure FDA00031629529300000234
, denoted
Figure FDA00031629529300000235
, and at the same time calculating from
Figure FDA00031629529300000236
and d_k the loss of the discriminator network
Figure FDA00031629529300000237
, denoted
Figure FDA00031629529300000238
; then, using backpropagation, calculating from
Figure FDA00031629529300000239
the network parameter gradient of the generator network
Figure FDA00031629529300000240
and calculating from
Figure FDA00031629529300000241
the network parameter gradient of the discriminator network
Figure FDA00031629529300000242
; then, using a gradient descent algorithm, using the network parameter gradient of
Figure FDA00031629529300000243
to update, for
Figure FDA00031629529300000244
, the network parameters
Figure FDA00031629529300000245
, and using the network parameter gradient of
Figure FDA00031629529300000246
to update, for
Figure FDA00031629529300000247
, the parameters
Figure FDA00031629529300000248
;
(3d) determining whether q_1 = Q_1 holds; if so, the M trained generative adversarial networks
Figure FDA00031629529300000249
are obtained; otherwise, letting q_1 = q_1 + 1 and returning to step (3b);
(4) using the trained generator network
Figure FDA0003162952930000031
to predict traffic data packet features:
taking the training sample set X_train as the input of each trained generator network
Figure FDA0003162952930000032
, which takes the sample set with application category label R_m,
Figure FDA0003162952930000033
and performs preprocessed traffic data packet feature prediction on each of its samples to obtain the predicted traffic data packet feature set
Figure FDA0003162952930000034
where
Figure FDA0003162952930000035
denotes the result for the sample set with network application category label R_m,
Figure FDA0003162952930000036
namely the predicted traffic data packet feature set obtained by prediction on its V samples,
Figure FDA0003162952930000037
Figure FDA0003162952930000038
denotes the traffic data packet feature obtained when, from the sample set in the training sample set X_train whose network application category label is R_m,
Figure FDA0003162952930000039
the v-th sample
Figure FDA00031629529300000310
is passed through the trained generator network
Figure FDA00031629529300000311
for prediction;
(5) obtaining the network traffic generation result:
randomly selecting, from the predicted traffic data packet feature set A, the predicted traffic data packet feature set with application category label R_w,
Figure FDA00031629529300000312
and from
Figure FDA00031629529300000313
randomly selecting L predicted traffic data packet features
Figure FDA00031629529300000314
; the traffic generator, according to the predicted traffic data packet feature set
Figure FDA00031629529300000315
, generates an initial traffic data packet sequence c = {c_1, c_2, ..., c_l, ..., c_L}; the data to be sent by the communication node is encrypted and then embedded into each initial traffic data packet c_l, yielding network traffic c' = {c'_1, c'_2, ..., c'_l, ..., c'_L} comprising L traffic data packets with embedded encrypted data, where
Figure FDA00031629529300000316
is the l-th predicted traffic data packet feature randomly selected from the predicted traffic data packet feature set
Figure FDA00031629529300000317
, c_l denotes the initial traffic data packet generated by the traffic generator according to the l-th predicted traffic data packet feature, c'_l denotes the traffic data packet obtained by embedding encrypted data into the l-th initial traffic data packet, 1 ≤ l ≤ L, L ≥ V, and R_1 ≤ R_w ≤ R_M.
2. The network background traffic generation method based on a generative adversarial network (GAN) according to claim 1, wherein, in the generative adversarial network
Figure FDA00031629529300000318
of step (2):
in the generator network
Figure FDA00031629529300000319
, the layers of the first fully connected module all use the leaky ReLU activation function and have 50, 30 and 30 neurons, respectively; the output layer has 8 neurons and uses the tanh activation function;
in the discriminator network
Figure FDA0003162952930000041
, the layers of the second fully connected module all use the leaky ReLU activation function and have 100, 60 and 30 neurons, respectively; the output layer has 5 neurons and uses the softmax activation function.
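The layer sizes recited above can be illustrated with a small forward-pass sketch in plain Python. The input width of 8, the leaky ReLU slope of 0.01, the random weight initialization and all function names are assumptions made only for this illustration; no training is performed.

```python
import math
import random


def dense(x, weights, biases):
    """Fully connected layer: one weight row and one bias per output neuron."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]


def leaky_relu(values, slope=0.01):  # slope is an assumed value
    return [v if v > 0 else slope * v for v in values]


def tanh_layer(values):
    return [math.tanh(v) for v in values]


def softmax(values):
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def init_layers(sizes, rng):
    """Random (weights, biases) pairs for consecutive layer sizes."""
    return [
        ([[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)], [0.0] * n_out)
        for n_in, n_out in zip(sizes, sizes[1:])
    ]


def forward(x, layers, out_activation):
    """Leaky ReLU on every hidden layer, out_activation on the output layer."""
    for weights, biases in layers[:-1]:
        x = leaky_relu(dense(x, weights, biases))
    weights, biases = layers[-1]
    return out_activation(dense(x, weights, biases))


rng = random.Random(0)
IN_DIM = 8  # assumed input width; the claim fixes only hidden and output sizes
generator = init_layers([IN_DIM, 50, 30, 30, 8], rng)
discriminator = init_layers([8, 100, 60, 30, 5], rng)

predicted_features = forward([0.5] * IN_DIM, generator, tanh_layer)
class_probabilities = forward(predicted_features, discriminator, softmax)
```

The tanh output keeps each of the 8 generated feature values in [-1, 1], and the softmax output yields 5 probabilities that sum to 1.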
3. The network background traffic generation method based on a generative adversarial network (GAN) according to claim 1, wherein, in step (3c), the loss calculated from
Figure FDA0003162952930000042
for the generator network
Figure FDA0003162952930000043
, denoted
Figure FDA0003162952930000044
, and the loss calculated from
Figure FDA0003162952930000045
and d_k for the discriminator network
Figure FDA0003162952930000046
, denoted
Figure FDA0003162952930000047
, are given, respectively, by the following formulas:
Figure FDA0003162952930000048
Figure FDA0003162952930000049
CN202110796467.1A 2021-07-14 2021-07-14 Network background flow generation method based on generation of confrontation network GAN Active CN113542271B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110796467.1A CN113542271B (en) 2021-07-14 2021-07-14 Network background flow generation method based on generation of confrontation network GAN

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110796467.1A CN113542271B (en) 2021-07-14 2021-07-14 Network background flow generation method based on generation of confrontation network GAN

Publications (2)

Publication Number Publication Date
CN113542271A true CN113542271A (en) 2021-10-22
CN113542271B CN113542271B (en) 2022-07-26

Family

ID=78128004

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110796467.1A Active CN113542271B (en) 2021-07-14 2021-07-14 Network background flow generation method based on generation of confrontation network GAN

Country Status (1)

Country Link
CN (1) CN113542271B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114050972A (en) * 2022-01-13 2022-02-15 广东电网有限责任公司广州供电局 OTA upgrading method

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109889452A (en) * 2019-01-07 2019-06-14 中国科学院计算技术研究所 Network context flow generation method and system based on condition production confrontation network
CN111651642A (en) * 2020-04-16 2020-09-11 南京邮电大学 Improved TEXT-GAN-based flow data set generation method
WO2020226696A1 (en) * 2019-12-05 2020-11-12 Huawei Technologies Co. Ltd. System and method of generating a video dataset with varying fatigue levels by transfer learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
李杰: "基于生成对抗网络的网络流量特征伪装技术", 《计算机工程》 *

Also Published As

Publication number Publication date
CN113542271B (en) 2022-07-26

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant