CN110569928B - Micro Doppler radar human body action classification method of convolutional neural network - Google Patents


Info

Publication number: CN110569928B (grant of CN110569928A)
Application number: CN201910897354.3A
Authority: CN (China)
Prior art keywords: layer, convolutional, neural network, pooling, convolution
Legal status: Active (application granted)
Inventors: 叶文彬, 陈海权
Assignee (original and current): Shenzhen University
Other languages: Chinese (zh)

Classifications

    • G01S13/58 — Systems using reflection or reradiation of radio waves; velocity or trajectory determination systems; sense-of-movement determination systems
    • G01S7/417 — Target characterisation using analysis of the echo signal, involving the use of neural networks
    • G01S7/418 — Target characterisation, theoretical aspects
    • G06F18/241 — Pattern recognition; classification techniques relating to the classification model, e.g. parametric or non-parametric approaches


Abstract

The invention discloses a micro-Doppler radar human action classification method based on a convolutional neural network. The method comprises a raw-data processing stage and a deep convolutional neural network, with the output of the raw-data processing stage connected to the deep network. The raw-data processing stage is a one-dimensional convolution process, and the deep convolutional neural network comprises several multi-scale dense connection modules with pooling and fully connected layers. The pooled multi-scale dense connection modules are connected in series: the input of the raw-data processing stage is raw radar data, the input of the first pooled module is connected to the output of the raw-data processing stage, the output of the last pooled module is connected to the fully connected layers, and the fully connected layers output the classification label. The neural network has a simple structure, few parameters, a small computational load, high running speed, and high accuracy.

Description

Micro Doppler radar human body action classification method of convolutional neural network
[ technical field ]
The invention relates to human action recognition technology, and in particular to a micro-Doppler radar human action classification method using a convolutional neural network.
[ background art ]
Human activity recognition is used in many fields, such as elderly monitoring, non-contact medical monitoring, and human-computer interaction. Micro-Doppler radar is one of the common sensors for capturing human motion signals: it has penetration capability and is unaffected by environmental conditions such as light and weather. Human actions such as running, boxing, and walking produce a micro-Doppler effect that can be used to characterize the motion of the human body. How to extract and analyze radar micro-Doppler features has therefore become a research hot spot in recent years.
Over the past decades, methods that analyze radar micro-Doppler characteristics for human motion recognition have generally fallen into two categories. The first is machine-learning-based methods, which typically involve two steps: feature extraction and classification. One feature-extraction approach works directly on the original signal but discards temporal information; another applies a short-time Fourier transform to the original signal and extracts envelope information from the spectrogram as the features. The extracted features are then fed to a support vector machine, a K-nearest-neighbor classifier, or a similar classifier. Because manually extracted features are not optimal, the second category applies deep learning, taking the spectrogram directly as the input of a convolutional neural network for feature extraction and classification.
The invention with application number CN201710325528.X discloses a convolutional neural network human action classification method based on simulated radar images, comprising the following steps: establish a time-frequency image data set containing various human actions; augment the radar time-frequency image data; build a convolutional neural network model: starting from the handwriting-recognition network LeNet (3 convolutional layers, 2 pooling layers, and 2 fully connected layers), replace the original Sigmoid activation function with the rectified linear unit (ReLU), add one pooling layer, and remove one fully connected layer, yielding a network with 3 convolutional layers, 3 max-pooling layers, and 1 fully connected layer; adjust the interlayer structure, in-layer structure, and training parameters of the network to achieve a better classification effect; and train the convolutional neural network model.
First, although spectrograms reflect the physical relationships of human-motion radar signals, this representation is decoupled from the learning of the neural network model, and a spectrogram is not a representation optimized for the classification task. The STFT (short-time Fourier transform) of a radar signal can be regarded as a one-dimensional convolution with fixed Fourier coefficients. Second, converting the radar signal to a specific domain with an STFT or a trainable convolution process yields a two-dimensional matrix. Existing deep-learning methods treat such two-dimensional matrices as conventional optical images and classify them with two-dimensional convolutional neural networks. However, the pixels of an optical image have high spatial correlation, whereas the entries of the radar's two-dimensional matrix have mainly temporal correlation. Treating the 2D matrix as an optical image and classifying it with a two-dimensional convolutional neural network therefore leads to many network parameters, a large computational load, and low running speed.
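To make the claim that an STFT is a one-dimensional convolution with fixed Fourier coefficients concrete, here is a minimal Python sketch (all names are illustrative, not from the patent): one DFT bin of a signal frame equals the output of a fixed-weight 1-D convolution evaluated at the frame's offset.

```python
import math

def dft_bin(frame, k):
    """Real and imaginary parts of DFT bin k of one frame: plain dot
    products of the frame with fixed cosine/sine coefficient vectors."""
    N = len(frame)
    re = sum(x * math.cos(-2 * math.pi * k * n / N) for n, x in enumerate(frame))
    im = sum(x * math.sin(-2 * math.pi * k * n / N) for n, x in enumerate(frame))
    return re, im

def conv_at(signal, kernel, offset):
    """One output of a 1-D convolution (correlation form) of `kernel`
    slid over `signal`, evaluated at `offset`."""
    return sum(signal[offset + n] * w for n, w in enumerate(kernel))

N = 8                                      # frame length (one STFT window)
signal = [math.sin(2 * math.pi * 3 * n / N) for n in range(16)]
k = 2                                      # frequency bin of interest
cos_kernel = [math.cos(-2 * math.pi * k * n / N) for n in range(N)]

frame = signal[4:4 + N]                    # one STFT frame starting at offset 4
re, _ = dft_bin(frame, k)
# The real part of the STFT bin equals a fixed-weight 1-D convolution output:
assert abs(conv_at(signal, cos_kernel, 4) - re) < 1e-9
```

Replacing the fixed cosine/sine kernels with trainable weights turns this fixed transform into the learnable one-dimensional convolution stage of the method.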
[ summary of the invention ]
The invention aims to provide a micro-Doppler radar human action classification method using a convolutional neural network with few network parameters, a small computational load, and high running speed.
To solve the above technical problems, the invention adopts the following technical scheme. The micro-Doppler radar human action classification method of the convolutional neural network comprises a raw-data processing stage and a deep convolutional neural network, with the output of the raw-data processing stage connected to the deep network. The raw-data processing stage is a one-dimensional convolution process, and the deep convolutional neural network comprises several multi-scale dense connection modules with pooling and fully connected layers; the input of the first pooled multi-scale dense connection module is connected to the output of the raw-data processing stage, the output of the last pooled multi-scale dense connection module is connected to the fully connected layers, and the fully connected layers output the classification label.
In the above method, the raw-data processing stage comprises two one-dimensional convolutional layers and a max-pooling layer connected in series; the first one-dimensional convolutional layer has n channels, and the second one-dimensional convolutional layer and the max-pooling layer have 2n channels. The first one-dimensional convolutional layer takes the raw radar data as input, and the output of the max-pooling layer is connected to the first pooled multi-scale dense connection module; n is the initial number of channels.
In the above method, each pooled multi-scale dense connection module comprises a multi-scale dense connection module followed by a max-pooling layer, with the output of the dense connection module connected to the max-pooling layer. There are two fully connected layers, connected in series; dropout with probability 0.6 is applied between the max-pooling layer of the last pooled multi-scale dense connection module and the first fully connected layer. The rectified linear unit is used as the activation function in the first fully connected layer, and the second fully connected layer uses the normalized exponential function as the activation function for the final classification.
In the above method, the multi-scale dense connection module comprises 4 convolution branches, a pass-through branch, and a concatenation layer. The 4 convolution branches and the pass-through branch are connected in parallel between the input of the module and the concatenation layer, which concatenates the four convolution branches and the input together as the output of the module.
In the above method, the first and second convolution branches each comprise two one-dimensional convolutional layers connected in series; the third convolution branch comprises an average-pooling layer followed by a one-dimensional convolutional layer; and the fourth convolution branch comprises a single one-dimensional convolutional layer.
In the above method, the first convolutional layer of the first branch has kernel size 1 and (1/8)k channels, and the second convolutional layer of the first branch has kernel size 5 and (i/16)k channels; the first convolutional layer of the second branch has kernel size 1 and (1/8)k channels, and the second convolutional layer of the second branch has kernel size 3 and ((8-i)/16)k channels; the average-pooling layer of the third branch has sample size 3, and its convolutional layer has kernel size 1 and (1/4)k channels; the convolutional layer of the fourth branch has kernel size 1 and (1/4)k channels. Here k is the total number of channels of the 4 convolution branches, and i is a parameter controlling the ratio between the numbers of size-3 and size-5 convolution kernels.
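A minimal sketch (function name assumed, not from the patent) of how the branch widths follow from these ratios; with k = 64 and the values i = 4, 6, 7 used later in the description, the four branch widths always sum back to k:

```python
def branch_channels(k, i):
    """Output channels of the four ID-Block convolution branches.

    Ratios from the text: i/16 (size-5 branch), (8-i)/16 (size-3 branch),
    1/4 (pooling branch), 1/4 (1x1 branch); they always sum to k.
    """
    return [i * k // 16, (8 - i) * k // 16, k // 4, k // 4]

# k = 64 as in the experiments; i = 4, 6, 7 for the three ID-Blocks
for i in (4, 6, 7):
    assert sum(branch_channels(64, i)) == 64

print(branch_channels(64, 4))  # -> [16, 16, 16, 16]
print(branch_channels(64, 7))  # -> [28, 4, 16, 16]
```

As i grows with depth, channels shift from the size-3 branch to the size-5 branch while the 1 × 1 and pooling branches stay fixed.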
In the above method, the deep convolutional neural network comprises several pooled multi-scale dense connection modules. In front-to-back order, the concatenation layer and max-pooling layer of the first pooled module have 2n + k channels, and each additional pooled module increases the channel count of its concatenation and max-pooling layers by k, where n is the initial number of channels.
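The channel growth described here can be sketched as follows (a hypothetical helper; the printed figures assume the n = 64, k = 64, three-block configuration given later in the description):

```python
def network_channels(n, k, num_blocks):
    """Channel count after the stem and after each pooled ID-Block.

    The stem ends with 2n channels; each ID-Block concatenates its input
    with k new branch channels, so every block adds k to the total."""
    counts = [2 * n]
    for _ in range(num_blocks):
        counts.append(counts[-1] + k)
    return counts

# n = 64, k = 64, three pooled ID-Blocks, as in the embodiment
print(network_channels(64, 64, 3))  # -> [128, 192, 256, 320]
```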
In the above method, the deep convolutional neural network comprises three pooled multi-scale dense connection modules, whose values of i are set to 4, 6, and 7 from front to back. Within the multi-scale dense connection modules, all convolution and pooling operations have stride 1 and use zero padding to keep the same shape as the input.
In the above method, all elements in the max-pooling layer of the last pooled multi-scale dense connection module are flattened into a vector, classification is then performed by the fully connected layers, and the classification label is output.
In the above method, the first convolutional layer of the raw-data processing stage has kernel size 51, stride 2, and n = 64 channels; the second convolutional layer has kernel size 9 and stride 2; and the max-pooling layer has size 4 and stride 4.
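With the strides stated here, the length of each example can be traced through the stem. The sketch below assumes a 3-second acquisition at 2000 Hz (6000 samples, as stated in the embodiment), 'same' zero padding in the convolutions, and no padding in the pooling layer; the helper names are illustrative:

```python
import math

def conv_out_len(length, stride):
    """Output length of a 'same'-padded 1-D convolution with this stride."""
    return math.ceil(length / stride)

def pool_out_len(length, size, stride):
    """Output length of an unpadded 1-D max-pooling layer."""
    return (length - size) // stride + 1

length = 3 * 2000                      # 3 s at 2000 Hz -> 6000 raw samples
length = conv_out_len(length, 2)       # conv, kernel 51, stride 2 -> 3000
length = conv_out_len(length, 2)       # conv, kernel 9,  stride 2 -> 1500
length = pool_out_len(length, 4, 4)    # max pool, size 4, stride 4 -> 375
print(length)                          # time steps entering the first ID-Block
```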
The neural network of the invention has a simple structure, few parameters, a small computational load, high running speed, and high accuracy.
[ description of the drawings ]
The present invention will be described in further detail with reference to the accompanying drawings and specific embodiments.
Fig. 1 is a schematic diagram of human activity data acquisition according to an embodiment of the present invention.
In fig. 1: (a) running, (b) walking, (c) walking while lifting a gun, (d) crawling, (e) boxing while walking, (f) boxing while standing, (g) sitting still.
FIG. 2 is a general architecture diagram of ID-1D-CNN according to an embodiment of the present invention.
In fig. 2, (1) one-dimensional convolutional layer, (2) maximum value pooling layer, (3) average value pooling layer, (4) splice layer, (5) full connection layer, and (6) label.
Fig. 3 is a diagram showing the overall structure of the ID-Block according to the embodiment of the present invention.
In fig. 3, (1) one-dimensional convolution layer, and (2) average-value pooling layer.
FIG. 4 is a graph illustrating the effect of different amounts of training data on the accuracy of CAE and ID-1D-CNN according to an embodiment of the present invention.
FIG. 5 is a graph illustrating the effect of data acquisition time on accuracy in accordance with an embodiment of the present invention.
[ detailed description of the invention ]
The micro Doppler radar human body action classification method of the one-dimensional convolution neural network comprises the following steps:
As shown in fig. 1, 7 classes of daily human action data were collected in free space using an Infineon Sense2GoL micro-Doppler radar, for a total of 14,923 groups. The 7 classes are: running (2075 groups), walking (2367 groups), walking while lifting a gun (2064 groups), crawling (1972 groups), boxing while walking (1967 groups), boxing while standing (2492 groups), and sitting still (2049 groups). The data were collected from seven experimenters, five male and two female. A schematic of the seven actions, the beam width, and the target range is shown in fig. 1: the radar beam width is 20° in the vertical direction and 42° in the horizontal direction, and the measurement range is between 0.5 and 5 meters. A detailed description of the 7 classes of daily human actions is given in table 1. The sampling frequency of the radar is 2000 Hz, and the acquisition duration of each group of actions is 3 seconds. Each group of data was normalized to mean 0 and variance 1.
TABLE 1 detailed description of seven classes of human daily actions
(Table 1 appears only as an image in the source and is not reproduced here.)
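The per-group normalization mentioned above can be sketched as a generic zero-mean, unit-variance standardization (the function name is illustrative, not from the patent):

```python
import math

def standardize(samples):
    """Scale one acquisition to zero mean and unit variance,
    as described for each 3-second group of radar data."""
    n = len(samples)
    mean = sum(samples) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in samples) / n)
    return [(x - mean) / std for x in samples]

group = [1.0, 2.0, 3.0, 4.0, 5.0]      # stand-in for one 3-second recording
z = standardize(group)
assert abs(sum(z) / len(z)) < 1e-12                       # mean ~ 0
assert abs(sum(x * x for x in z) / len(z) - 1.0) < 1e-12  # variance ~ 1
```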
An end-to-end one-dimensional convolutional neural network, the ID-1D-CNN, is then built; its overall architecture is shown in FIG. 2. To reduce memory usage, ID-Blocks are used in the higher layers, while the lower layers use conventional convolutions to extract the main features. The input to the ID-1D-CNN is one-dimensional raw radar data, which then passes through two one-dimensional convolutional layers and a max-pooling layer. Notably, the technique of decomposing a large two-dimensional convolution kernel into several small kernels (e.g., 3 × 3), used in Inception-v3 and Inception-ResNet to reduce parameters and increase depth, is not applicable to the 1D convolutional layers of the present invention, because there it would increase the number of parameters.
In the ID-1D-CNN framework of the invention, the convolution kernel sizes of the first two layers of the stem module (primary module) are far larger than 3. The second convolutional layer has twice as many channels as the first, because the subsequent pooling layer downsamples its output. Each of the three multi-scale dense connection modules (ID-Blocks) is followed by a max-pooling layer, forming a pooled multi-scale dense connection module; the pooling reduces the size of the feature map and allows more abstract features to be extracted. The structures and parameters of the three ID-Blocks differ slightly, as detailed below. After the max-pooling layer of the last pooled multi-scale dense connection module, all elements are flattened into a vector, classification is performed by the fully connected layers, and the classification label is output.
In the ID-1D-CNN shown in FIG. 2 there are three ID-Blocks: ID-Block (a), ID-Block (b), and ID-Block (c). Here n and k denote channel counts: n is the number of channels of the first convolutional layer (the initial number of channels), and k is the number of channels added by each ID-Block.
In the ID-1D-CNN of the present invention, the ID-Block is designed to improve network performance. The following techniques are applied in the ID-Block:
Inspired by GoogLeNet, a multi-scale technique is applied in the proposed ID-1D-CNN. To construct wider and deeper CNNs, a multi-branch convolution structure is used, with each branch using a different filter size and hence a different receptive field. Through training, the network automatically selects the appropriate convolution kernel size within the module.
Dense connections are also used in the proposed network. Dense connectivity mitigates the vanishing-gradient problem, increases feature reuse, and improves network performance.
To further reduce the number of parameters and increase network depth, the Network-in-Network (NiN) technique is applied. In NiN, the channels of the feature map are compressed by a 1 × 1 convolution, which reduces the parameters of the network while increasing its depth.
In the block diagram, a formula of the form a × k means that a is the size of the convolution kernel and k is the total number of channels of the four convolution branches. The ratios of the channel counts at the end of the four branches are i/16, (8-i)/16, 1/4, and 1/4. The number 3 in the pooling layer is the sample size.
As shown in fig. 3, the ID-Block contains four convolution branches and one pass-through branch. Three of the convolution branches end in convolutional layers with kernel sizes of 1, 3, and 5, respectively. The NiN technique is usually applied to two-dimensional convolutions; the present invention applies it to one-dimensional convolutions for the same purpose. Thus, the branches with kernel sizes 3 and 5 are preceded by a convolutional layer with kernel size 1, intended to compress channels and avoid a dimensionality explosion. For the reasons given earlier, a size-5 kernel is not decomposed into two size-3 kernels. Because the pooling layer plays an important role in convolutional neural networks, it is embedded in an ID-Block convolution branch; as 2D pooling is typically 3 × 3, the pooling size in the proposed ID-Block is 3. Finally, the four convolution branches are concatenated with the input to form the output. The ID-1D-CNN uses this densely connected approach to address the vanishing-gradient problem in 1D convolution.
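To illustrate why the 1 × 1 channel compression described above saves parameters, the following sketch (with assumed example numbers l = 128 input channels, k = 64, and a 16-channel size-5 branch; these are illustrative, not taken from the patent's tables) compares a direct size-5 convolution against the NiN bottleneck:

```python
def conv1d_params(kernel, c_in, c_out):
    """Weight count of a 1-D convolution (biases ignored)."""
    return kernel * c_in * c_out

# Illustrative numbers: l = 128 input channels, bottleneck = k/8 = 8,
# and a size-5 branch producing 16 output channels.
l, bottleneck, out = 128, 64 // 8, 16

direct = conv1d_params(5, l, out)                               # size-5 conv straight from l
nin = conv1d_params(1, l, bottleneck) + conv1d_params(5, bottleneck, out)

print(direct, nin)       # -> 10240 1664
assert nin < direct      # the 1x1 compression cuts the parameter count
```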
In the proposed ID-Block, the total number of convolution kernels (channels) at the ends of the four branches is denoted k. As shown in FIG. 3, the channel ratios at the ends of the branches are i/16, (8-i)/16, 1/4, and 1/4, where i is a parameter controlling the ratio between the numbers of size-5 and size-3 convolution kernels. In other words, the channel counts of the size-1 and pooling branches are fixed, while the numbers of size-3 and size-5 kernels vary with the depth of the ID-Block in the network. The number of large convolution kernels increases with network depth, because higher-layer features are more abstract and less spatially concentrated. The values of i for ID-Block (a), ID-Block (b), and ID-Block (c) are set to 4, 6, and 7, respectively. All convolution and pooling operations in the ID-Block have stride 1 and use zero padding to keep the same shape as the input, so that the branches can be concatenated. Assuming the number of input channels is l, the number of parameters in an ID-Block is calculated as
(The parameter-count formula appears only as an image in the source and is not reproduced here.)
where PN denotes the total number of parameters and each bracketed term is the parameter count of one convolution branch of the ID-Block.
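Since the formula itself is only available as an image in the source, the following sketch reconstructs the per-branch weight count from the branch description above (biases omitted; this is an inference from the text, not the patent's published formula):

```python
def id_block_params(l, k, i):
    """Layer-by-layer weight count of one ID-Block (biases omitted).

    Reconstructed from the branch description: two NiN branches
    (1x1 to k/8 channels, then size-5 or size-3 to ik/16 or (8-i)k/16
    channels), a pooling branch with a 1x1 conv to k/4 channels, and
    a plain 1x1 branch to k/4 channels."""
    b1 = 1 * l * (k // 8) + 5 * (k // 8) * (i * k // 16)        # 1x1 then size-5
    b2 = 1 * l * (k // 8) + 3 * (k // 8) * ((8 - i) * k // 16)  # 1x1 then size-3
    b3 = 1 * l * (k // 4)                                       # avg pool + 1x1
    b4 = 1 * l * (k // 4)                                       # plain 1x1
    return b1 + b2 + b3 + b4

# e.g. a first ID-Block with l = 128 input channels, k = 64, i = 4
print(id_block_params(128, 64, 4))  # -> 7168
```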
Verification example:
When building the model, the human action data were divided into five folds, and the model structure was validated with five-fold cross-validation to obtain the final model. The results of the other models were then also obtained with 5-fold cross-validation for comparison.
In the experiments below, the data of each category were divided into 5 parts, and the performance of the proposed model was verified using 5-fold cross-validation. The overall structure of the ID-1D-CNN is shown in FIG. 2; the network input is the raw radar signal. In the first convolutional layer of the stem module of the raw-data processing stage, the convolution kernel has size 51 and stride 2. After trying kernels of different sizes in the first layer, size 51 was found to be optimal; intuitively, 51 sample points of the micro-Doppler signal resemble one "frame" of a speech signal. The zero padding is 25 to maintain the same shape, and the number of channels n is 64. The second convolutional layer has kernel size 9 and stride 2, also with zero padding, and is followed by a max-pooling layer of size 4 and stride 4. Each added ID-Block increases the channel count of the whole network by k; in the experiments, k was set to 64. Each ID-Block is followed by a max-pooling layer of size 4 with stride 4. At the end of the ID-1D-CNN are two fully connected layers; the first has 160 hidden neurons. Dropout with probability 0.6 is applied between the last max-pooling layer and the first fully connected layer. The rectified linear unit (ReLU) is the activation function of the first fully connected layer, the second fully connected layer uses the normalized exponential function (softmax) for the final classification, and the arctangent function (arctan) serves as the activation function of the other layers.
The performance of the proposed ID-1D-CNN was evaluated by four indices: the average accuracy of the 5-fold cross-validation results, the number of parameters, the average training time of one epoch, and the test time of a single sample without back-propagation. The model was trained on a server with 64 GB of memory, an NVIDIA GeForce GTX 1080 Ti graphics card (GPU, 12 GB memory), and a 2.5 GHz Intel(R) Xeon(R) CPU E5-2678 v3, using the Keras framework with a TensorFlow backend. The Adam (adaptive moment estimation) optimizer was used with a batch size of 64 and an initial learning rate of 0.001. A learning-rate decay strategy was applied: if the validation-set accuracy did not improve for 20 epochs, the learning rate was halved, down to a minimum of 0.00001. Early stopping was used to detect convergence: if the validation-set accuracy did not improve for 50 epochs, the model was considered converged and training stopped. Network parameters were initialized with the Glorot uniform distribution.
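The learning-rate schedule described above can be sketched as a simple plateau rule (a hypothetical helper mirroring reduce-on-plateau behavior; not code from the patent):

```python
def next_lr(lr, epochs_since_improvement, patience=20, factor=0.5, min_lr=1e-5):
    """Plateau-based learning-rate decay as described: halve the rate
    when validation accuracy has not improved for `patience` epochs,
    never dropping below `min_lr`."""
    if epochs_since_improvement >= patience:
        return max(lr * factor, min_lr)
    return lr

lr = 0.001                               # initial learning rate
assert next_lr(lr, 5) == 0.001           # still improving: unchanged
lr = next_lr(lr, 20)                     # 20-epoch plateau: halved
assert lr == 0.0005
assert next_lr(1.5e-5, 20) == 1e-5       # never drops below the floor
```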
To demonstrate the superiority of this scheme, the method of the present invention was compared with the methods in recently published documents [1-6]. Document [2] first proposed an SVM with a Gaussian kernel to classify human activities. In document [1], the ANN has a single hidden layer of 34 neurons. Document [3] proposed a convolutional neural network consisting of 3 convolutional layers and 2 fully connected layers. Document [4] proposed a deeper CNN with 5 convolutional layers. In document [5], the classic ResNet-18, consisting of 18 residual modules, classifies spectrogram images of six human activities. The convolutional auto-encoder in document [6] consists of 3 convolutional layers and 3 deconvolutional layers and addresses the more challenging 12-class action classification problem. In addition, Inception-ResNet (I-ResNet) [7] was added to the comparison.
TABLE 2 results of 5-fold cross-validation of different models
(Table 2 appears only as an image in the source and is not reproduced here.)
TABLE 3 parameters, training times and Single sample test times for different models
(Table 3 appears only as images in the source and is not reproduced here.)
The comparative results for each model are shown in tables 2 and 3. The conventional methods (MLP and SVM) have relatively low accuracy, because the envelopes of the spectrograms obtained in the experiments of the present invention (the manually extracted features) are not as distinct as those in [1,2]; this shows that manual feature extraction has significant limitations. The deep-learning methods (2D) achieve accuracies between 93.95% and 95.71%, and the accuracy of the ID-1D-CNN is 0.39% to 2.15% higher. The ID-1D-CNN model has 365,319 parameters, 2.02 to 148.6 times fewer than the other deep networks. Its average training time per epoch is 5.62 seconds, faster than the other models. More importantly, its single-sample test time without back-propagation is 0.141 milliseconds, 2.14 to 29.68 times faster than the other models.
The performance of CAE and ID-1D-CNN with different amounts of training data is shown in FIG. 4. The performance of both methods improves as the number of training samples increases. However, when the training data is reduced to below 50% of the total training data, the CAE method outperforms the described method. In other words, the CAE method may achieve better performance than the described method when the number of training samples is limited.
To further study the performance of ID-1D-CNN, a multi-scale densely connected two-dimensional CNN (ID-2D-CNN) and STFT-1D-Net were also constructed. ID-2D-CNN has the same architecture as ID-1D-CNN with the dimensions expanded: for example, the convolution kernel size in ID-2D-CNN is 3×3 instead of 3, and the size and stride of its 4 maximum pooling layers are both 2×2. The remaining hyperparameters are identical to ID-1D-CNN. STFT-1D-Net has the same structure and parameters as ID-1D-CNN with the first two convolutional layers and the first pooling layer removed; its input is the spectrogram, viewed as a one-dimensional sequence of vectors.
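The dimensional expansion from ID-1D-CNN to ID-2D-CNN explains most of the parameter gap: for the same channel counts, a 3×3 kernel carries three times the weights of a size-3 kernel. A minimal sketch of this arithmetic (the channel counts below are illustrative, not the network's actual ones):

```python
def conv1d_params(c_in, c_out, k):
    """Parameter count of a 1-D conv layer: weights + one bias per output channel."""
    return c_out * (c_in * k + 1)

def conv2d_params(c_in, c_out, k):
    """Parameter count of a square k x k 2-D conv layer."""
    return c_out * (c_in * k * k + 1)

# Expanding a size-3 kernel to 3x3 multiplies the weight count roughly 3x
# (exactly 3x for the weights; biases are unchanged).
p1 = conv1d_params(64, 128, 3)   # 128 * (64*3 + 1) = 24,704
p2 = conv2d_params(64, 128, 3)   # 128 * (64*9 + 1) = 73,856
assert 2.9 < p2 / p1 < 3.0
```

Summed over every layer of the network, this per-layer factor accounts for the bulk of the 71.55% parameter reduction reported below.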
TABLE 4 Results of cross-validation (table reproduced as an image in the original)
TABLE 5 Parameters, training times, and single-sample test times for ID-2D-CNN and STFT-1D-Net (table reproduced as an image in the original)
The results for ID-2D-CNN and STFT-1D-Net are shown in Tables 4 and 5. The classification accuracy of ID-1D-CNN is 96.10%, slightly higher than that of ID-2D-CNN (96.01%). However, ID-1D-CNN uses 365,319 parameters, a 71.55% reduction over ID-2D-CNN; its training process is 5.54 times faster, and its test process, which involves only forward propagation, is 10.2 times faster. The accuracy of STFT-1D-Net is 94.59%, 1.51% lower than the proposed ID-1D-CNN method. Thus, replacing the STFT with two trainable convolutional layers improves performance.
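The reason the STFT can be replaced by convolutional layers at all is that an STFT is itself a one-dimensional convolution with fixed Fourier coefficients, as noted later in the description. The NumPy sketch below checks this equivalence numerically; the window length of 64, hop of 32, and rectangular window are illustrative choices, not parameters of the invention.

```python
import numpy as np

def stft_via_conv(x, win_len=64, hop=32):
    """Compute STFT magnitudes by correlating the signal with fixed
    Fourier kernels -- i.e. a strided 1-D convolution whose weights
    are the (non-trainable) DFT basis vectors."""
    n_bins = win_len // 2 + 1
    n = np.arange(win_len)
    f = np.arange(n_bins)
    # Real and imaginary DFT kernels, one pair per frequency bin.
    cos_k = np.cos(2 * np.pi * np.outer(f, n) / win_len)
    sin_k = -np.sin(2 * np.pi * np.outer(f, n) / win_len)
    frames = np.stack([x[i:i + win_len]
                       for i in range(0, len(x) - win_len + 1, hop)])
    real = frames @ cos_k.T          # "convolution" with cosine kernels
    imag = frames @ sin_k.T          # "convolution" with sine kernels
    return np.sqrt(real**2 + imag**2)

rng = np.random.default_rng(0)
x = rng.standard_normal(512)
mag = stft_via_conv(x)
# Reference: frame-wise FFT gives the same magnitudes.
frames = np.stack([x[i:i + 64] for i in range(0, len(x) - 64 + 1, 32)])
ref = np.abs(np.fft.rfft(frames, axis=1))
assert np.allclose(mag, ref)
```

Replacing the fixed kernels `cos_k`/`sin_k` with trainable weights turns this fixed transform into the first convolutional layers of the network, which can then be optimized jointly with the classifier.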
The invention also compares the accuracy for different data acquisition times (from 1 s to 3 s, in steps of 0.2 s); the results are shown in FIG. 5. The effective detection range of the radar is 5 meters, and for the running activity, covering 5 meters takes less than 3 seconds, so the maximum acquisition time is set to 3 seconds. In general, the longer the acquisition time, the higher the accuracy. The accuracy is highest at an acquisition time of 3 seconds, so the acquisition time is set to 3 seconds in the present invention.
Compared with the prior art, the embodiment of the invention has the following advantages and beneficial effects:
1. The input of the ID-1D-CNN of the invention is the raw radar signal, not the spectrogram obtained by applying a short-time Fourier transform (STFT) to the signal. The STFT can be viewed as a one-dimensional convolution with fixed Fourier coefficients; accordingly, in the present invention, the first two convolutional layers of ID-1D-CNN are used in place of the STFT. The feature-representation process for the radar signal is thereby fused into the neural network, and a better feature representation can be learned by optimizing the classification error.
2. The network structure of the invention is one-dimensional and better fits time-dependent data (radar data). Transforming the radar signal into the time-frequency domain with an STFT yields a two-dimensional matrix (spectrogram). Existing deep learning methods treat this two-dimensional matrix as a typical optical image and use a 2D-CNN for classification. However, the pixels of an optical image have spatial correlation, whereas the entries of the radar's two-dimensional matrix are chiefly correlated in time; treating the matrix as an optical image and classifying it with a 2D-CNN is therefore not necessarily optimal. A 1D-CNN is typically used to process time-series signals and is better able to extract temporal correlation from radar signals. Experimental results show that the method achieves better performance than existing 2D-CNN methods.
3. Combining the two points above, the intrinsic characteristics of the radar data are learned from the raw radar input and the action category is output directly, making full use of the end-to-end learning advantage of neural networks.
4. The ID-Block provided by the invention fuses multiple convolutional branches, the network-in-network (NiN) technique, and dense connections, which enhance the representational power of the features, increase feature reuse, and mitigate gradient vanishing.
5. The neural network has a simple structure, few parameters, high accuracy, and low complexity, and can potentially be embedded into a hardware system.
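The channel bookkeeping behind advantage 4 can be verified arithmetically from the branch definitions given in the claims: the four branch widths always sum to k, so the concatenation layer grows the channel count by exactly k per block regardless of i. A minimal check, with k = 128 as an illustrative growth rate (not a value fixed by the invention):

```python
def id_block_channels(k, i):
    """Per-branch output channels of the ID-Block (per claim 5).
    Branch 1: 1x1 conv (k/8 ch) -> size-5 conv with (i/16)k ch
    Branch 2: 1x1 conv (k/8 ch) -> size-3 conv with ((8-i)/16)k ch
    Branch 3: size-3 average pool -> 1x1 conv with k/4 ch
    Branch 4: 1x1 conv with k/4 ch
    The parameter i trades capacity between the size-5 and size-3 kernels."""
    b1 = i * k // 16
    b2 = (8 - i) * k // 16
    b3 = k // 4
    b4 = k // 4
    return b1, b2, b3, b4

k = 128
for i in (4, 6, 7):   # the i values of the three blocks, per claim 7
    # (i/16)k + ((8-i)/16)k + k/4 + k/4 = k/2 + k/2 = k for any i
    assert sum(id_block_channels(k, i)) == k
```

Because the concatenation layer also appends the block's input (the dense connection), each successive block adds k channels on top of the previous width, matching the 2n + k, 2n + 2k, ... progression of claim 6.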
Reference documents:
[1] Y. Kim and H. Ling, "Human activity classification based on micro-Doppler signatures using an artificial neural network," in Antennas and Propagation Society International Symposium (AP-S 2008), IEEE, 2008, pp. 1-4.
[2] Y. Kim and H. Ling, "Human activity classification based on micro-Doppler signatures using a support vector machine," IEEE Transactions on Geoscience and Remote Sensing, vol. 47, no. 5, pp. 1328-1337, 2009.
[3] Y. Kim and T. Moon, "Human detection and activity classification based on micro-Doppler signatures using deep convolutional neural networks," IEEE Geoscience and Remote Sensing Letters, vol. 13, no. 1, pp. 8-12, 2016.
[4] T. S. Jordan, "Using convolutional neural networks for human activity classification on micro-Doppler radar spectrograms," in Sensors, and Command, Control, Communications, and Intelligence (C3I) Technologies for Homeland Security, Defense, and Law Enforcement Applications XV, 2016, vol. 9825, p. 982509, International Society for Optics and Photonics.
[5] H. Du, Y. He, and T. Jin, "Transfer learning for human activities classification using micro-Doppler spectrograms," in 2018 IEEE International Conference on Computational Electromagnetics (ICCEM), 2018, pp. 1-3.
[6] M. S. Seyfioğlu and S. Z. Gürbüz, "Deep neural network initialization methods for micro-Doppler classification with low training sample support," IEEE Geoscience and Remote Sensing Letters, vol. 14, no. 12, pp. 2462-2466, 2017.
[7] C. Szegedy, S. Ioffe, V. Vanhoucke, and A. A. Alemi, "Inception-v4, Inception-ResNet and the impact of residual connections on learning," in AAAI, 2017, vol. 4, p. 12.

Claims (9)

1. A micro-Doppler radar human body action classification method of a convolutional neural network, comprising a raw data processing process and a deep convolutional neural network, wherein the output of the raw data processing process is connected to the deep convolutional neural network; a plurality of multi-scale dense connection modules with pooling are connected in series in sequence; the input of the raw data processing process is raw radar data; the input end of the first multi-scale dense connection module with pooling is connected with the output end of the raw data processing process, the output end of the last multi-scale dense connection module with pooling is connected with a fully-connected layer, and the fully-connected layer outputs the classification label; the raw data processing process comprises two one-dimensional convolutional layers and a maximum pooling layer connected in series in sequence, wherein the number of channels of the first one-dimensional convolutional layer is n, and the number of channels of the second one-dimensional convolutional layer and of the maximum pooling layer is 2n; the first one-dimensional convolutional layer takes the raw radar data as input, and the output end of the third layer, the maximum pooling layer, is connected with the first multi-scale dense connection module with pooling; where n is the initial number of channels.
2. The micro-Doppler radar human body action classification method of a convolutional neural network as claimed in claim 1, wherein the multi-scale dense connection module with pooling comprises a multi-scale dense connection module and a maximum pooling layer connected front to back, the output of the multi-scale dense connection module being connected to the maximum pooling layer; there are two fully-connected layers, connected in series; dropout (random inactivation) with probability 0.6 is applied between the maximum pooling layer of the last multi-scale dense connection module with pooling and the first fully-connected layer; the rectified linear unit (ReLU) function is used as the activation function in the first fully-connected layer, and the second fully-connected layer uses the softmax (normalized exponential) function as the activation function for the final classification.
3. The micro-Doppler radar human body action classification method of a convolutional neural network as claimed in claim 2, wherein the multi-scale dense connection module comprises 4 convolutional branches, one through branch, and a concatenation layer; the 4 convolutional branches and the through branch are connected in parallel between the input end of the multi-scale dense connection module and the concatenation layer at the rear, and the concatenation layer concatenates the outputs of the four convolutional branches and the input together as the output of the multi-scale dense connection module.
4. The micro-Doppler radar human body action classification method of a convolutional neural network as claimed in claim 3, wherein the first convolutional branch and the second convolutional branch each comprise two one-dimensional convolutional layers connected in series; the third convolutional branch comprises an average pooling layer and a one-dimensional convolutional layer connected in series in front-to-back order; and the fourth convolutional branch comprises a single one-dimensional convolutional layer.
5. The micro-Doppler radar human body action classification method of a convolutional neural network as claimed in claim 4, wherein the kernel size of the first one-dimensional convolutional layer of the first convolutional branch is 1 with (1/8)k channels, and the kernel size of the second one-dimensional convolutional layer of the first convolutional branch is 5 with (i/16)k channels; the kernel size of the first one-dimensional convolutional layer of the second convolutional branch is 1 with (1/8)k channels, and the kernel size of the second one-dimensional convolutional layer of the second convolutional branch is 3 with ((8-i)/16)k channels; the sampling size of the average pooling layer of the third convolutional branch is 3, and the kernel size of the one-dimensional convolutional layer of the third convolutional branch is 1 with (1/4)k channels; the kernel size of the one-dimensional convolutional layer of the fourth convolutional branch is 1 with (1/4)k channels; where k is the total number of channels over the 4 convolutional branches and i is a parameter controlling the ratio between the numbers of size-3 and size-5 convolution kernels.
6. The micro-Doppler radar human body action classification method of a convolutional neural network as claimed in claim 5, wherein the deep convolutional neural network comprises a plurality of the multi-scale dense connection modules with pooling; in front-to-back order, the number of channels of the concatenation layer and of the maximum pooling layer of the first multi-scale dense connection module with pooling is 2n + k, and the number of channels of the concatenation layer and of the maximum pooling layer increases by k for each subsequent multi-scale dense connection module with pooling; wherein n is the initial number of channels.
7. The micro-Doppler radar human body action classification method of a convolutional neural network as claimed in claim 6, wherein the deep convolutional neural network comprises three of said multi-scale dense connection modules with pooling, with the value of i set to 4, 6 and 7 respectively from front to back; within these modules, all convolution and pooling operations have a stride of 1 and use zero padding to keep the output the same shape as the input.
8. The micro-Doppler radar human body action classification method of a convolutional neural network as claimed in claim 6, wherein all elements in the maximum pooling layer of the last multi-scale dense connection module with pooling are flattened into a single column and then classified by the fully-connected layers to output a classification label.
9. The micro-Doppler radar human body action classification method of a convolutional neural network as claimed in claim 1, wherein, in the first convolutional layer of the raw data processing process, the kernel size is 51, the stride is 2, and the number of channels n is 64; the kernel size of the second convolutional layer is 9 with a stride of 2; and the maximum pooling layer has a size of 4 and a stride of 4.
CN201910897354.3A 2019-09-23 2019-09-23 Micro Doppler radar human body action classification method of convolutional neural network Active CN110569928B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910897354.3A CN110569928B (en) 2019-09-23 2019-09-23 Micro Doppler radar human body action classification method of convolutional neural network


Publications (2)

Publication Number Publication Date
CN110569928A CN110569928A (en) 2019-12-13
CN110569928B true CN110569928B (en) 2023-04-07

Family

ID=68781793

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910897354.3A Active CN110569928B (en) 2019-09-23 2019-09-23 Micro Doppler radar human body action classification method of convolutional neural network

Country Status (1)

Country Link
CN (1) CN110569928B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111339969B (en) * 2020-03-02 2023-06-20 深圳市瑞立视多媒体科技有限公司 Human body posture estimation method, device, equipment and storage medium

Citations (1)

Publication number Priority date Publication date Assignee Title
CN110245581A (en) * 2019-05-25 2019-09-17 天津大学 A kind of Human bodys' response method based on deep learning and distance-Doppler sequence

Family Cites Families (11)

Publication number Priority date Publication date Assignee Title
NL8102059A (en) * 1981-04-27 1982-11-16 Univ Leiden APPARATUS AND METHOD FOR QUANTITATIVE ASSESSMENT OF ACCIDENTAL BODY MOVEMENTS.
JP2017169726A (en) * 2016-03-23 2017-09-28 カシオ計算機株式会社 Measuring device, measuring method, and measuring program
CN107169435B (en) * 2017-05-10 2021-07-20 天津大学 Convolutional neural network human body action classification method based on radar simulation image
CA3032188A1 (en) * 2018-01-31 2019-07-31 Pin-Han Ho Deep convolutional neural network architecture and system and method for building the deep convolutional neural network architecture
CN108564549B (en) * 2018-04-20 2022-04-05 福建帝视信息科技有限公司 Image defogging method based on multi-scale dense connection network
CN108932480B (en) * 2018-06-08 2022-03-15 电子科技大学 Distributed optical fiber sensing signal feature learning and classifying method based on 1D-CNN
CN109034374A (en) * 2018-07-03 2018-12-18 苏州中科启慧软件技术有限公司 The relative depth sequence estimation method of convolutional network is intensively connected to using multi-scale
CN109086700B (en) * 2018-07-20 2021-08-13 杭州电子科技大学 Radar one-dimensional range profile target identification method based on deep convolutional neural network
CN109645980A (en) * 2018-11-14 2019-04-19 天津大学 A kind of rhythm abnormality classification method based on depth migration study
CN109961095B (en) * 2019-03-15 2023-04-28 深圳大学 Image labeling system and method based on unsupervised deep learning
CN110222748B (en) * 2019-05-27 2022-12-20 西南交通大学 OFDM radar signal identification method based on 1D-CNN multi-domain feature fusion




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant