CN114912486A - Modulation mode intelligent identification method based on lightweight network - Google Patents

Modulation mode intelligent identification method based on lightweight network

Info

Publication number
CN114912486A
Authority
CN
China
Prior art keywords
network
training
convolution
data
modulation
Prior art date
Legal status
Pending
Application number
CN202210503877.7A
Other languages
Chinese (zh)
Inventor
周福辉
王锐韬
王芮宇
梁宏韬
徐铭
赵越
吴启晖
董超
Current Assignee
Nanjing University of Aeronautics and Astronautics
Original Assignee
Nanjing University of Aeronautics and Astronautics
Application filed by Nanjing University of Aeronautics and Astronautics
Priority to CN202210503877.7A
Publication of CN114912486A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 2218/00 Aspects of pattern recognition specially adapted for signal processing
    • G06F 2218/12 Classification; Matching
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 27/00 Modulated-carrier systems
    • H04L 27/32 Carrier systems characterised by combinations of two or more of the types covered by groups H04L27/02, H04L27/10, H04L27/18 or H04L27/26
    • H04L 27/34 Amplitude- and phase-modulated carrier systems, e.g. quadrature-amplitude modulated carrier systems
    • H04L 27/38 Demodulator circuits; Receiver circuits
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 30/00 Reducing energy consumption in communication networks
    • Y02D 30/70 Reducing energy consumption in communication networks in wireless communication networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Signal Processing (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Digital Transmission Methods That Use Modulated Carrier Waves (AREA)

Abstract

The invention discloses a modulation mode intelligent identification method based on a lightweight network, which mainly addresses the problems of existing modulation mode intelligent identification methods: complex network scale, large parameter count, low prediction accuracy, and slow convergence. The method comprises the following implementation steps: acquiring signal data; establishing category labels; preprocessing the I/Q data; building a network model from the linear bottleneck structure and the inverted residual structure; training the network model with the training data; judging whether network training is finished; inputting the test set data into the network; and outputting the test result. By introducing depthwise separable convolution, the linear bottleneck structure and the inverted residual network, the invention not only greatly reduces the network scale but also improves the convergence rate while maintaining identification accuracy, and can be applied to actual communication scenes; compared with other conventional convolutional neural network methods, the intelligent identification performance of the modulation mode is improved.

Description

Modulation mode intelligent identification method based on lightweight network
Technical Field
The invention belongs to the field of wireless communication, and particularly relates to a modulation mode intelligent identification method based on a lightweight network.
Background
Modulation mode intelligent identification is a vital part of a wireless communication network and one of the main methods of cognitive radio. It identifies the modulation type of signals received from different devices, which facilitates recovering the transmitted information and determining a suitable demodulation method. It can also improve spectrum utilization efficiency when spectrum resources are scarce, making it an indispensable intelligent communication technology in wireless communication. Existing modulation mode intelligent identification techniques can be divided into two types: model-driven methods and data-driven methods. Model-driven methods are further divided into likelihood-based methods and feature-based methods. Likelihood-based methods treat modulation identification as a hypothesis testing problem, whereas feature-based methods aim at finding better features of the received signal. However, both are computationally complex and require full a priori knowledge.
In recent years, modulation classification has been performed with data-driven methods, which capture class-discriminative information directly from differences in the raw data. Automatic modulation classification schemes based on deep learning over I/Q samples have received a great deal of attention due to their superior accuracy. In the paper "Over-the-air deep learning based radio signal classification" (IEEE Journal of Selected Topics in Signal Processing, vol. 12, no. 1, pp. 168-179, 2018), O'Shea T J, Roy T, and Clancy T C propose two customized network architectures, VGG and ResNet, based on the RadioML2018.01A data set, taking the in-phase and quadrature parts (I/Q) of the received signal as the input for feature extraction; experiments show that accuracy improves markedly under the data-driven method. However, these networks are large, and their parameter counts and computation demands exceed the capabilities of many mobile and portable devices. In the paper "Sparsely connected CNN for efficient automatic modulation recognition" (IEEE Transactions on Vehicular Technology, vol. 69, no. 12, pp. 15557-15568, 2020), Tunze G B, Huynh-The T, Lee J M et al. propose a network architecture using the depthwise convolution and pointwise convolution of the lightweight MobileNets. But the accuracy of that network is degraded.
There is thus a contradiction between network accuracy and network complexity, which makes practical use difficult. How to balance the two, reducing complexity (i.e., training time, computation cost and memory cost) in mobile applications while still deploying an efficient deep learning architecture, so as to meet the lightweight requirements of future wireless communication networks, urgently requires a new modulation mode intelligent identification scheme.
Therefore, the above problems of the prior art need to be addressed to overcome its shortcomings.
Disclosure of Invention
Aiming at the contradiction in existing modulation mode intelligent identification technology between network model accuracy and the lightweight network-scale requirement of actual communication scenes, the invention provides a modulation mode intelligent identification method based on a lightweight network, which greatly reduces the network-scale parameters while ensuring prediction accuracy and achieving a fast network convergence speed.
In order to achieve the purpose, the modulation mode intelligent identification method based on the lightweight network comprises the following steps:
acquiring signal data, namely performing vector conversion processing on the in-phase part I and quadrature part Q of a received signal to form a modulation identification signal;
establishing a category label, and establishing a corresponding category label file according to the acquired modulation identification signal;
step three, I/Q data preprocessing is carried out;
step four, building a network model;
step five, training a network model by using training data;
step six, judging whether the network training is finished, if so, executing step seven, and if not, adding one to the training iteration times and then continuing to train the network in step five;
step seven, inputting the test set data into a network;
and step eight, outputting a test result.
Further, in step one, the modulation class of the received signal can be formulated as a K-class hypothesis test, and the received signal under the k-th modulation hypothesis H_k is

H_k: x_k(n) = s_k(n) + ω_k(n),  n = 1, 2, ..., N

wherein s_k(n) denotes the transmitted signal, x_k(n) denotes the received signal, N is the number of signal symbols, and ω_k(n) is additive white Gaussian noise with zero mean and variance σ².

The in-phase part I and quadrature part Q of the received signal x_k(n) are input directly to the neural network without normalization; by converting the received signal x_k(n) to a vector x_k, the I/Q signal samples are represented as

x_k = [I_k, Q_k]^T

wherein I_k and Q_k represent the in-phase and quadrature parts of the received signal, respectively.
Further, in step three, after the signal samples are received, features are extracted from the raw data by a deep-learning-based scheme and input into a fully connected layer for feature dimension conversion; feature learning is expressed as the process of mapping the raw data x_k ∈ ℝ^{2×N} to an L-dimensional vector y, given as

y = f(x_k),  f: ℝ^{2×N} → ℝ^L

wherein the mapping function f represents a feature learning model with fully connected layers, ℝ^{2×N} denotes the vector space in which x_k lies, y represents the feature vector output by the fully connected layer, and ℝ^L denotes the vector space in which y lies.
Further, step four includes the following steps:
step 4a, in the input stage, converting the raw data of the signal samples into tensors to match the requirements of the deep learning framework;
step 4b, building the network with the linear bottleneck structure and the inverted residual structure, extracting features from the input signals, and using an average pooling layer to adapt to the input length;
step 4c, setting the width factor hyperparameter and adjusting the input and output dimensions, thereby further adjusting the network parameter count.
further, in step 4b,
the network structure adopts a linear bottleneck structure and a reverse residual error structure, fully extracts data characteristics through the reverse residual error structure constructed by deep convolution and point-by-point convolution, reduces the parameter quantity and ensures the precision of the parameter quantity, uses batch normalization to accelerate the training convergence speed after each convolution layer, uses ReLU6 nonlinear function activation, and comprises one convolution layer, 14 linear bottleneck structures, the convolution kernel size is 3 multiplied by 3, and batch normalization is added in the training process,
first, using a convolution layer of 3 × 3 with a step size of 2, the feature dimension 2 is raised to 32 by deep convolution,
then, sequentially passing through 14 linear bottleneck structures, setting the expansion factor of each linear bottleneck structure as t and the input dimensionality as M, firstly increasing the dimensionality to t multiplied by M by using point-to-point convolution, then carrying out repeated training on the linear bottleneck structures, wherein only the step length in the first linear bottleneck structure is s when the repeated training is carried out, the rest default is 1, compressing the output dimensionality by point-to-point convolution, increasing the dimensionality to 128 by the linear bottleneck structures, increasing the dimensionality to 512 by a 1 multiplied by 1 convolution layer,
and finally, performing pooling operation by adopting an average pooling layer, reducing characteristic dimensionality, generating a classification result, inputting output dimensionality generated by the linear bottleneck structure into the average pooling layer, and averaging each output dimensionality of each corresponding channel in the last convolution layer, wherein the output dimensionality of the average pooling layer is Cx 1 x 1, and C is the number of output channels.
Further, in step 4c, a width factor hyperparameter α is set and each convolution layer is uniformly thinned; the value range of α is 0.4-0.75, and the computation and parameter count of the whole network are reduced to α² of their original values.
Further, in step 4c, the value of α is preferably 0.5.
Furthermore, in step five,

firstly, the trainable parameters of the network are randomly initialized, the initial trainable parameters being randomly initialized according to a monotonically decreasing rule, because a sequence always correlates most strongly with the data point closest to the current time and the correlation decreases with distance;

secondly, the training data are input into the network in batches for training, with adjustable batch size; the training error of each batch is back-propagated to optimize the network parameters; the error loss function adopts the cross-entropy loss function, which measures the difference between the model output and the true modulation mode label and can be written as

Loss = -(1/N) · Σ_{i=1}^{N} Σ_{k=1}^{K} q_{i,k} · log(p_{i,k})

wherein K represents the number of classes, N represents the number of samples, p_{i,k} is the output of the Softmax function and represents the probability that sample i belongs to class k, and q_{i,k} is an indicator variable given as

q_{i,k} = 1 if sample i belongs to class k, and q_{i,k} = 0 otherwise.

One complete pass over the training data constitutes one training iteration.
The invention has the following beneficial effects:

First, since the invention introduces the inverted residual structure and the linear bottleneck structure and uses an average pooling layer instead of a conventional fully connected layer, the network scale and parameter count are greatly reduced compared with conventional deep learning frameworks.

Secondly, since the linear bottleneck structure in the lightweight-network-based modulation mode intelligent identification method is composed of depthwise convolution and pointwise convolution, loss of data detail with increasing network depth is avoided, network parameters are greatly reduced, and modulation mode identification performance is preserved.

Thirdly, the width factor hyperparameter is introduced to adjust the input and output dimensions of the network, thereby further tuning the network parameter count and accuracy and improving the flexibility of the network scale.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a diagram of a modulation scheme intelligent recognition network framework of the present invention;
FIG. 3 is a comparison of network size parameters using the present invention and other prior art techniques;
FIG. 4 is a graph comparing the prediction accuracy under different SNR conditions using the present invention and other prior art techniques;
fig. 5 is a graph comparing the convergence rate of training using the present invention and other prior art techniques.
Detailed Description
The invention will be further explained with reference to the drawings.
The specific steps of the method of the present invention are described below with reference to FIG. 1:
step 1, signal data acquisition and processing.
The modulation classification can be formulated as a K-class hypothesis test, where K represents the number of modulation schemes. The received signal under the k-th modulation hypothesis H_k is

H_k: x_k(n) = s_k(n) + ω_k(n),  n = 1, 2, ..., N

wherein s_k(n) and x_k(n) denote the transmitted and received signals, respectively, N is the number of signal symbols, and ω_k(n) is additive white Gaussian noise with zero mean and variance σ².

The in-phase and quadrature parts (I/Q) of the received signal are used simultaneously; the two parts obey the same independent distribution and can be input directly to the network model of step 4 without normalization. By converting the received signal x_k(n) to a vector x_k, the I/Q signal samples can be represented as

x_k = [I_k, Q_k]^T

where Re (the first two letters of "Real") denotes the real part of a complex number, Im (the first two letters of "Imaginary") denotes the imaginary part, and I_k = Re{x_k} and Q_k = Im{x_k} represent the in-phase and quadrature parts of the received signal, respectively. The raw data x_k can be written in matrix form as

x_k = [ Re{x_k(1)} ... Re{x_k(N)} ; Im{x_k(1)} ... Im{x_k(N)} ] ∈ ℝ^{2×N}

wherein N is the number of signal symbols.
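For illustration, a minimal sketch of this I/Q vectorization is given below, assuming NumPy and a complex-valued sample vector; the function and variable names are illustrative, not from the patent.

```python
import numpy as np

def to_iq_matrix(x: np.ndarray) -> np.ndarray:
    """Convert N complex received samples x_k(n) into the 2 x N real matrix
    whose first row is the in-phase part Re{x_k} and whose second row is
    the quadrature part Im{x_k}."""
    return np.stack([x.real, x.imag], axis=0)  # shape (2, N)

# Example: N = 128 samples of a noisy complex exponential.
n = np.arange(128)
x = np.exp(1j * 0.2 * np.pi * n) + 0.1 * (np.random.randn(128) + 1j * np.random.randn(128))
print(to_iq_matrix(x).shape)  # (2, 128)
```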
And 2, establishing a corresponding class label file according to the acquired modulation identification signal.
Eight different digital modulation modes are used for training and testing, namely 8PSK, BPSK, CPFSK, GFSK, PAM4, 16QAM, 64QAM and QPSK, and the signal noise is additive white Gaussian noise (AWGN).
And 3, preprocessing the I/Q data.
After a large number of signal samples are received, features are extracted from the raw data using deep-learning-based CNN and RNN schemes. The extracted features are then input to a fully connected layer (FC) for information integration, after which the feature dimensions are converted. Feature learning can be expressed as the process of mapping the raw data x_k ∈ ℝ^{2×N} to an L-dimensional vector y, given as

y = f(x_k),  f: ℝ^{2×N} → ℝ^L

wherein the mapping function f represents a feature learning model with fully connected layers, ℝ^{2×N} denotes the vector space in which x_k lies, y denotes the feature vector output by the fully connected layer, and ℝ^L denotes the vector space in which y lies.
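As a sketch, the mapping f could be realized as a fully connected layer in PyTorch; the dimensions N = 128 and L = 64 below are illustrative assumptions, not values fixed by the patent.

```python
import torch
import torch.nn as nn

N, L = 128, 64                      # signal length and feature dimension (illustrative)
f = nn.Sequential(
    nn.Flatten(),                   # (batch, 2, N) -> (batch, 2N)
    nn.Linear(2 * N, L),            # fully connected feature-dimension conversion
)
x_k = torch.randn(16, 2, N)         # a batch of I/Q samples
y = f(x_k)                          # one L-dimensional feature vector per sample
print(y.shape)                      # torch.Size([16, 64])
```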
And 4, building a network model.
In the input stage, the raw data of the signal samples are converted into tensors, and the visual feature space is used as an embedding space to match the requirements of the deep learning framework.
Both the linear bottleneck structure and the inverted residual structure use depthwise separable convolution to reduce the computational effort. Depthwise separable convolution breaks the interaction between the number of output channels and the convolution kernel size. It consists of two parts: a depthwise convolution and a pointwise convolution. The depthwise convolution applies a single filter to each input channel, and the pointwise convolution then creates a linear combination of the depthwise outputs. If the number of input channels is M, the number of output channels is C, the convolution kernel size is D_k × D_k, and the feature map size is D_F × D_F, the ratio of the computation of a depthwise separable convolution to that of a standard convolution is

(D_k · D_k · M · D_F · D_F + M · C · D_F · D_F) / (D_k · D_k · M · C · D_F · D_F) = 1/C + 1/D_k²
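The following PyTorch sketch shows a depthwise separable convolution under the notation above (M input channels, C output channels, D_k = 3); groups=M is what makes the first convolution depthwise, one filter per input channel. The channel counts and feature-map size are illustrative.

```python
import torch
import torch.nn as nn

M, C = 32, 64  # input / output channels (illustrative)

dsconv = nn.Sequential(
    # Depthwise: one 3x3 filter per input channel (groups=M).
    nn.Conv2d(M, M, kernel_size=3, padding=1, groups=M, bias=False),
    nn.BatchNorm2d(M),
    nn.ReLU6(inplace=True),
    # Pointwise: 1x1 convolution forming linear combinations of the depthwise outputs.
    nn.Conv2d(M, C, kernel_size=1, bias=False),
    nn.BatchNorm2d(C),
    nn.ReLU6(inplace=True),
)

x = torch.randn(1, M, 2, 128)   # an I/Q-shaped feature map
print(dsconv(x).shape)          # torch.Size([1, 64, 2, 128])
```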
the linear bottleneck structure is evolved from a bottleneck structure, is used for MobileNet V2 for the first time, and is subjected to dimensionality increase by using point-to-point convolution in the first layer of the structure, wherein the number of input channels is M, and the expansion factor is t; the second layer performs a deep convolution using a convolution kernel of size 3 × 3; and the third layer uses point-by-point convolution again, correlates the features after the deep convolution and outputs the specified channel number C. The linear bottleneck structure is used in two cases, the first is to use the residual structure when the step size is 1, and the second is to use no residual structure when the step size is 2. By the linear bottleneck structure, the number of parameters and convolution calculation amount are effectively reduced compared with the standard convolution. The network is optimized both spatially and temporally.
The inverse residual structure is optimized based on the residual structure (Residuals) in the ResNet, and actually increases the residual propagation based on the linear bottleneck structure. The inverse residual structure uses a first layer of point-by-point convolution to increase dimension, then uses deep convolution, and then uses point-by-point convolution to decrease dimension, which is exactly the opposite of the residual structure in ResNet. Such convolution operations are more advantageous in reducing the number of parameters and the amount of computation.
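A minimal PyTorch sketch of the inverted residual / linear bottleneck block just described: 1×1 pointwise expansion by factor t, 3×3 depthwise convolution, then a linear 1×1 projection, with the residual connection used only at stride 1 with matching channel counts. The class name and default values are illustrative.

```python
import torch.nn as nn

class InvertedResidual(nn.Module):
    """Linear bottleneck with inverted residual (illustrative sketch)."""

    def __init__(self, m: int, c: int, stride: int = 1, t: int = 6):
        super().__init__()
        hidden = m * t
        self.use_residual = stride == 1 and m == c
        self.block = nn.Sequential(
            nn.Conv2d(m, hidden, 1, bias=False),             # pointwise expansion to t*M
            nn.BatchNorm2d(hidden), nn.ReLU6(inplace=True),
            nn.Conv2d(hidden, hidden, 3, stride=stride,      # 3x3 depthwise convolution
                      padding=1, groups=hidden, bias=False),
            nn.BatchNorm2d(hidden), nn.ReLU6(inplace=True),
            nn.Conv2d(hidden, c, 1, bias=False),             # linear pointwise projection
            nn.BatchNorm2d(c),                               # no activation: linear bottleneck
        )

    def forward(self, x):
        out = self.block(x)
        return x + out if self.use_residual else out
```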
Based on these two structures, and with reference to FIG. 2, a network structure for automatic modulation recognition is built. The network architecture contains one convolution layer followed by 14 linear bottleneck structures, as depicted in FIG. 2. ReLU6 is used as the nonlinear activation function. All of the above convolution kernels are 3×3 in size, and batch normalization is added during training.

First, a 3×3 convolution layer with stride 2 raises the feature dimension from 2 to 32. Then the signal passes through the 14 linear bottleneck structures in turn. Assuming the expansion factor of each linear bottleneck structure is t, pointwise convolution first raises the dimension to t×M; when a bottleneck structure is repeated, only the first repetition uses stride s and the rest default to stride 1. After the depthwise convolution, pointwise convolution compresses the features to the output dimension. The linear bottleneck structures raise the number of dimensions to 128. In this process, data features are fully extracted through the inverted residual structure constructed from depthwise convolution and pointwise convolution, reducing the parameter count while preserving accuracy. After each convolution layer, batch normalization (BN) is used to speed training convergence, and the ReLU6 nonlinear function is applied. The convolution operations in the linear bottleneck structure are as follows:
1×1 pointwise convolution (dimension expansion, ReLU6) → 3×3 depthwise convolution (ReLU6) → 1×1 pointwise convolution (linear projection).

For a linear bottleneck structure with feature-map size h × w, expansion factor t, kernel size k, M input channels and C output channels, the required computation is

h · w · M · t · (M + k² + C)
compared with the traditional full connection layer, the method adopts an average pooling layer (GAP) to perform pooling operation to reduce feature dimensionality and generate a classification result. And inputting the feature maps generated by the previous linear bottleneck structures into an average pooling layer, and averaging each feature map of each corresponding channel in the last convolutional layer. The output dimension of the average pooling layer is C × 1 × 1. The method can effectively relieve the occurrence of the over-fitting problem.
Finally, a width factor hyperparameter α is introduced, whose effect is to uniformly thin the network at every layer. The number of input channels becomes αM and the number of output channels αC, and the computation of the depthwise separable convolution becomes

D_k · D_k · αM · D_F · D_F + αM · αC · D_F · D_F

α is generally set between 0.4 and 0.75; the computation and parameter count of the whole network are reduced by roughly a factor of α², so smaller, faster models can be built more flexibly. Our network finally selects 0.5 as the width factor, which significantly reduces the parameter count while keeping high accuracy.
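The width factor amounts to scaling every layer's channel count by α before the network is built, as in the sketch below; the rounding rule is an assumption, since the patent does not specify one. Because the dominant pointwise term scales with both input and output channels, scaling each by α cuts that term by α².

```python
def scaled(channels: int, alpha: float = 0.5) -> int:
    """Apply the width factor: M -> round(alpha * M), at least 1 channel."""
    return max(1, int(round(channels * alpha)))

# The 32-, 64- and 128-channel layers of the description under alpha = 0.5:
for c in (32, 64, 128):
    print(c, "->", scaled(c))  # 32 -> 16, 64 -> 32, 128 -> 64
```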
And 5, training the network model by using the training data.

Firstly, the trainable parameters of the network are randomly initialized; because a sequence always correlates most strongly with the data point closest to the current time and the correlation decreases with distance, the initial trainable parameters are randomly initialized according to a monotonically decreasing rule. The initial number of training iterations (epochs) is 1, the maximum number of iterations is 40, and the learning rate is 0.01.

Secondly, the training data are input into the network in batches for training, with adjustable batch size; the training error of each batch is back-propagated to optimize the network parameters. The error loss function adopts the cross-entropy loss function, which measures the difference between the model output and the true modulation mode label and can be written as

Loss = -(1/N) · Σ_{i=1}^{N} Σ_{k=1}^{K} q_{i,k} · log(p_{i,k})

where K represents the number of classes and N the number of samples; p_{i,k} is the output of the Softmax function, representing the probability that data sample i belongs to class k; q_{i,k} is an indicator variable given as

q_{i,k} = 1 if sample i belongs to class k, and q_{i,k} = 0 otherwise.
One complete pass over the training data constitutes one epoch.
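A condensed sketch of this training loop in PyTorch: SGD with learning rate 0.01, cross-entropy loss, and a maximum of 40 epochs follow the text, while the placeholder model and random data stand in for the real network and RadioML samples.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

model = nn.Sequential(nn.Flatten(), nn.Linear(2 * 128, 8))   # placeholder network
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()   # cross-entropy between Softmax output and labels

x = torch.randn(512, 2, 128)        # placeholder I/Q training data
y = torch.randint(0, 8, (512,))     # placeholder modulation labels
loader = DataLoader(TensorDataset(x, y), batch_size=64, shuffle=True)

for epoch in range(40):             # maximum of 40 training iterations
    for xb, yb in loader:           # batches; the batch size is adjustable
        optimizer.zero_grad()
        loss = criterion(model(xb), yb)
        loss.backward()             # back-propagate the batch training error
        optimizer.step()            # optimize the network parameters
```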
And 6, judging whether the network training is finished.
And judging whether the current epoch reaches the maximum epoch, if so, performing the step 7, otherwise, adding one to the epoch, and continuing to perform the step 5 to train the network.
And 7, inputting the test set data into a network.
And 8, outputting a test result.
The effect of the present invention will be further explained with the simulation experiment.
1. Simulation conditions and parameter setting:
the simulation experiment of the invention is carried out on a simulation platform of Python3.6 and Pytroch 1.5.0. The computer CPU model is Intel core i5, and is equipped with an independent display card with the model of Inviet geforceGTX 1650. In an embodiment of the invention, we used the public data set radioml2016.10a to evaluate the results. The data is generated by GNU Radio and consists of 11 adjustment modes (8 digital modulation methods and 3 analog modulation methods), the signal-to-noise ratio range is from-20 dB to 20dB, and the step length is 2 dB. Each modulation type has 1000 samples at each different signal-to-noise ratio, with 128 points per sample. The entire data set consisted of 220000I/Q vectors in total.
8 digital adjustment modes in the data set are selected, the signal-to-noise ratio range is from-6 dB to 14dB, 88000 vectors are used in total, wherein 80% of the vectors are training sets, and 20% of the vectors are testing sets. The SGD training model is used, the initial learning rate is 0.01, the weight attenuation is 5 x 10 < -4 >, and the maximum iteration number is 40.
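For reference, a sketch of loading and splitting RadioML2016.10a along these lines; the pickle file name, the 'latin1' encoding, and the modulation key names (e.g. QAM16 for 16QAM) follow the dataset's usual conventions and are assumptions, not details stated in the patent.

```python
import pickle
import numpy as np

# RadioML2016.10a is distributed as a pickled dict keyed by (modulation, snr).
with open("RML2016.10a_dict.pkl", "rb") as fh:        # file name is an assumption
    data = pickle.load(fh, encoding="latin1")

digital = {"8PSK", "BPSK", "CPFSK", "GFSK", "PAM4", "QAM16", "QAM64", "QPSK"}
X, y = [], []
for (mod, snr), samples in data.items():              # samples: (1000, 2, 128) per key
    if mod in digital and -6 <= snr <= 14:
        X.append(samples)
        y.extend([mod] * len(samples))
X, y = np.concatenate(X), np.array(y)                 # ~88000 I/Q vectors

idx = np.random.permutation(len(X))                   # 80% train / 20% test split
split = int(0.8 * len(X))
train_idx, test_idx = idx[:split], idx[split:]
```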
2. Simulation content:

Figure 3 compares the network scale parameters and computation of the present invention and the prior art. Parameter size (Params), floating-point operations (FLOPs) and multiply-add operations (MAdd) are selected for comparison. In parameter count the method is far smaller than the convolutional neural network and has a clear advantage over the residual network (ResNet) and the Visual Geometry Group network (VGG), at only 0.4 Mb; the MAdd count is 2.22 Mb and the floating-point operation count is 1.15 M, which is 0.01% of the convolutional neural network, 18% of the residual network and 30% of the VGG. Compared with other modulation identification schemes, the network is lightweight, with both network parameters and computation greatly reduced.

Figure 4 compares the modulation recognition accuracy of the present invention and the prior art under different SNR conditions. The abscissa represents the signal-to-noise ratio (dB) and the ordinate the modulation identification accuracy. The star-marked line is the accuracy curve of the proposed method; the circle-marked line is that of the VGG under different signal-to-noise ratios; the cross-marked line is that of the residual network; the square-marked line is that of the convolutional neural network. The signal-to-noise ratio varies from -6 dB to 14 dB, and simulation experiments were conducted with the method of the invention and the prior art. Comparing the accuracy obtained by the four methods, the prediction accuracy of the proposed method is clearly higher than that of the conventional methods, and as a general trend the prediction accuracy increases with the signal-to-noise ratio. In low-SNR environments the method is about 2 dB better than the other techniques. At -6 dB the prediction accuracy of the method reaches about 55%, exceeding the other methods by about 15%; the accuracy gain remains evident as the SNR increases. At 0 dB the prediction accuracy reaches about 93%, exceeding the VGG- and CNN-based methods by about 15% and the residual network by about 5%. In high-SNR environments the accuracy of the invention reaches up to 95%, exceeding the VGG and the convolutional neural network by about 10% and the residual network by about 2%.

Figure 5 compares the network training speed of the present invention and the prior art. The abscissa represents the number of training epochs and the ordinate the loss function value. The star-marked line is the training convergence curve of the proposed method; the circle-marked line is that of the VGG; the cross-marked line is that of the residual network; the square-marked line is that of the convolutional neural network. Comparing the convergence curves obtained by the four methods, the training speed of the proposed method is clearly faster than that of the existing modulation mode intelligent identification methods: it converges in only about 10 epochs, whereas the VGG- and CNN-based methods need about 20 epochs.
By integrating the simulation results and analysis, the modulation mode intelligent identification method based on the lightweight network provided by the invention can greatly reduce the network scale and the calculated amount compared with the existing method, meanwhile, the performance of modulation mode identification is guaranteed, and the convergence speed is higher, so that the method can be better applied to the actual communication scene.
The above is only a preferred embodiment of the present invention, and the protection scope of the present invention is not limited to the above-mentioned embodiments, and all technical solutions belonging to the idea of the present invention belong to the protection scope of the present invention. It should be noted that modifications and embellishments within the scope of the invention may be made by those skilled in the art without departing from the principle of the invention.

Claims (8)

1. A modulation mode intelligent identification method based on a lightweight network is characterized by comprising the following steps:
acquiring signal data, namely performing vector conversion processing on the in-phase part I and quadrature part Q of a received signal to form a modulation identification signal;
establishing a category label, and establishing a corresponding category label file according to the acquired modulation identification signal;
step three, I/Q data preprocessing is carried out;
step four, building a network model;
step five, training a network model by using training data;
step six, judging whether the network training is finished, if so, executing step seven, and if not, adding one to the training iteration times and then continuing to train the network in step five;
step seven, inputting the test set data into a network;
step eight, outputting a test result.
2. The modulation mode intelligent identification method based on a lightweight network as claimed in claim 1, wherein in step one, the modulation classification of the received signal is formulated as a K-class hypothesis test, and the received signal under the k-th modulation hypothesis H_k is

H_k: x_k(n) = s_k(n) + ω_k(n),  n = 1, 2, ..., N

wherein s_k(n) denotes the transmitted signal, x_k(n) denotes the received signal, N is the number of signal symbols, and ω_k(n) is additive white Gaussian noise with zero mean and variance σ²; the in-phase part I and quadrature part Q of the received signal x_k(n) are input directly to the neural network without normalization by converting the received signal x_k(n) to a vector x_k, so that the I/Q signal samples are represented as

x_k = [I_k, Q_k]^T

wherein I_k and Q_k represent the in-phase and quadrature parts of the received signal, respectively.
3. The modulation mode intelligent identification method based on a lightweight network as claimed in claim 1, wherein in step three, after the signal samples are received, features are extracted from the raw data by a deep-learning-based scheme and input into a fully connected layer for feature dimension conversion, feature learning being expressed as the process of mapping the raw data x_k ∈ ℝ^{2×N} to an L-dimensional vector y, given as

y = f(x_k),  f: ℝ^{2×N} → ℝ^L

wherein the mapping function f represents a feature learning model with fully connected layers, ℝ^{2×N} denotes the vector space in which x_k lies, y represents the feature vector output by the fully connected layer, and ℝ^L denotes the vector space in which y lies.
4. The intelligent recognition method for modulation modes based on the lightweight network as claimed in claim 1, wherein the fourth step comprises the following steps,
step 4a, in an input stage, converting original data of a signal sample into tensor, and matching the requirements of a deep learning framework;
step 4b, building the network with the linear bottleneck structure and the inverted residual structure, extracting features from the input signals, and using an average pooling layer to adapt to the input length;
and 4c, setting a width factor hyperparameter, and adjusting input and output dimensions so as to further adjust the network parameter quantity.
5. The modulation mode intelligent identification method based on a lightweight network according to claim 4, characterized in that in step 4b,

the network structure adopts a linear bottleneck structure and an inverted residual structure; data features are fully extracted through the inverted residual structure constructed from depthwise convolution and pointwise convolution, reducing the parameter count while preserving accuracy; after each convolution layer, batch normalization is used to accelerate training convergence and the ReLU6 nonlinear activation is applied; the network comprises one convolution layer and 14 linear bottleneck structures, the convolution kernel size is 3×3, and batch normalization is added during training;

first, a 3×3 convolution layer with stride 2 raises the feature dimension from 2 to 32;

then the signal passes through the 14 linear bottleneck structures in turn; with the expansion factor of each linear bottleneck structure set to t and the input dimension to M, pointwise convolution first raises the dimension to t×M; when a bottleneck structure is repeated, only the first repetition uses stride s and the rest default to stride 1, and pointwise convolution compresses the result to the output dimension; the linear bottleneck structures raise the dimension to 128, and a 1×1 convolution layer raises it to 512;

finally, an average pooling layer performs the pooling operation, reducing the feature dimensionality and generating the classification result; the outputs generated by the linear bottleneck structures are input to the average pooling layer, which averages each feature map of each corresponding channel in the last convolution layer; the output dimension of the average pooling layer is C×1×1, where C is the number of output channels.
6. The modulation mode intelligent identification method based on a lightweight network as claimed in claim 4, wherein in step 4c, a width factor hyperparameter α is set and each convolution layer is uniformly thinned; the value range of α is 0.4-0.75, and the computation and parameter count of the whole network are reduced to α² of their original values.
7. The method for intelligently identifying the modulation mode based on the lightweight network as claimed in claim 6, wherein in step 4c, the value of α is 0.5.
8. The modulation mode intelligent identification method based on a lightweight network according to claim 1, characterized in that in step five,

firstly, the trainable parameters of the network are randomly initialized, the initial trainable parameters being randomly initialized according to a monotonically decreasing rule, because a sequence always correlates most strongly with the data point closest to the current time and the correlation decreases with distance;

secondly, the training data are input into the network in batches for training, with adjustable batch size; the training error of each batch is back-propagated to optimize the network parameters; the error loss function adopts the cross-entropy loss function, which measures the difference between the model output and the true modulation mode label and can be written as

Loss = -(1/N) · Σ_{i=1}^{N} Σ_{k=1}^{K} q_{i,k} · log(p_{i,k})

wherein K represents the number of classes, N represents the number of samples, p_{i,k} is the output of the Softmax function and represents the probability that sample i belongs to class k, and q_{i,k} is an indicator variable given as

q_{i,k} = 1 if sample i belongs to class k, and q_{i,k} = 0 otherwise;

one complete pass over the training data constitutes one training iteration.
CN202210503877.7A 2022-05-10 2022-05-10 Modulation mode intelligent identification method based on lightweight network Pending CN114912486A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210503877.7A CN114912486A (en) 2022-05-10 2022-05-10 Modulation mode intelligent identification method based on lightweight network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210503877.7A CN114912486A (en) 2022-05-10 2022-05-10 Modulation mode intelligent identification method based on lightweight network

Publications (1)

Publication Number Publication Date
CN114912486A true CN114912486A (en) 2022-08-16

Family

ID=82766864

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210503877.7A Pending CN114912486A (en) 2022-05-10 2022-05-10 Modulation mode intelligent identification method based on lightweight network

Country Status (1)

Country Link
CN (1) CN114912486A (en)


Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115499278A (en) * 2022-08-30 2022-12-20 哈尔滨工程大学 MIMO signal modulation identification method based on lightweight neural network
CN115499278B (en) * 2022-08-30 2024-06-04 哈尔滨工程大学 MIMO signal modulation identification method based on lightweight neural network
CN116488974A (en) * 2023-03-20 2023-07-25 中国人民解放军战略支援部队航天工程大学 Light modulation identification method and system combined with attention mechanism
CN116488974B (en) * 2023-03-20 2023-10-20 中国人民解放军战略支援部队航天工程大学 Light modulation identification method and system combined with attention mechanism
CN117708507A (en) * 2024-02-05 2024-03-15 成都麦特斯科技有限公司 Efficient alpha and beta ray identification and classification method based on artificial intelligence
CN117708507B (en) * 2024-02-05 2024-04-26 成都麦特斯科技有限公司 Efficient alpha and beta ray identification and classification method based on artificial intelligence


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination