CN111736125B - Radar target identification method based on attention mechanism and bidirectional stacked recurrent neural network - Google Patents

Radar target identification method based on attention mechanism and bidirectional stacked recurrent neural network

Info

Publication number
CN111736125B
Authority
CN
China
Prior art keywords
hrrp
sample
rnn
layer
radar
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010256158.0A
Other languages
Chinese (zh)
Other versions
CN111736125A (en)
Inventor
潘勉
吕帅帅
李训根
刘爱林
李子璇
张�杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Dianzi University
Original Assignee
Hangzhou Dianzi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Dianzi University filed Critical Hangzhou Dianzi University
Priority to CN202010256158.0A priority Critical patent/CN111736125B/en
Publication of CN111736125A publication Critical patent/CN111736125A/en
Application granted granted Critical
Publication of CN111736125B publication Critical patent/CN111736125B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G01: MEASURING; TESTING
    • G01S: RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S7/00: Details of systems according to groups G01S13/00, G01S15/00, G01S17/00
    • G01S7/02: Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S13/00
    • G01S7/41: Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S13/00 using analysis of echo signal for target characterisation; Target signature; Target cross-section
    • G01S7/417: Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S13/00 using analysis of echo signal for target characterisation; Target signature; Target cross-section involving the use of neural networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques
    • G06F18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques
    • G06F18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/047: Probabilistic or stochastic networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00: Road transport of goods or passengers
    • Y02T10/10: Internal combustion engine [ICE] based vehicles
    • Y02T10/40: Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Remote Sensing (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Radar Systems Or Details Thereof (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a radar target identification method based on an attention mechanism and a bidirectional stacked recurrent neural network. The HRRP samples are first preprocessed to reduce their sensitivity and a dynamic adjustment layer is established; a sliding window whose moving distance is smaller than its length is then selected to split the HRRP; the importance of each split sequence is then adjusted by an importance network; the temporal correlation of the samples is modeled by stacked bidirectional RNNs to extract high-level features of the samples; finally, a multi-level attention mechanism adjusts the importance of the hidden-layer states and target classification is performed by softmax.

Description

Radar target identification method based on attention mechanism and bidirectional stacked recurrent neural network
Technical Field
The invention belongs to the field of radar target identification, and particularly relates to a radar target identification method based on an attention mechanism and a bidirectional stacked recurrent neural network.
Background
The range resolution of a high-resolution wideband radar is much smaller than the target size, and its echo is known as the one-dimensional high-resolution range profile (High Resolution Range Profile, HRRP) of the target. The HRRP contains structural information that is extremely valuable for classification and identification, such as the radial size of the target and the distribution of its scattering points, and therefore has broad prospects for engineering application. For this reason, HRRP-based radar automatic target recognition has gradually become a research hot spot in the field of radar automatic target recognition.
For most HRRP target recognition systems, the original HRRP samples tend to have a high dimension and rarely reflect the essential attributes of the object to be recognized directly, so feature extraction is a key step. The main task of feature extraction is to assist the subsequent recognition task (for example by reducing the data dimension or enhancing the discriminative information) through some linear or nonlinear transformation. Effective features not only express the data adequately but also distinguish the differences between categories, thereby improving the recognition accuracy.
Conventional feature extraction methods can be divided into two groups: (1) feature extraction methods based on dimension reduction; (2) transform-based feature extraction methods, such as bispectrum, spectrogram and spectral amplitude features. These methods project the HRRP signal into the frequency domain and then model and identify its frequency-domain features. Traditional feature extraction methods achieve good recognition performance in experiments, but they suffer from two problems: (1) the way features are extracted is mostly unsupervised and lossy, meaning that some separable information is inevitably lost during feature extraction, which is detrimental to recognition by the back-end classifier; (2) the choice of feature extraction method depends heavily on the researcher's accumulated knowledge of and experience with HRRP data, which is difficult to achieve in some cases where no prior information is available.
To address the problems of the conventional methods in feature extraction, deep-learning-based methods have been introduced into the field of radar target recognition in recent years. Deep-learning-based recognition methods for radar high-resolution range profiles can be roughly divided into three types: (1) deep learning methods based on an encoder-decoder architecture; (2) deep learning methods based on a convolutional neural network (CNN) structure; (3) deep learning methods based on a recurrent neural network. However, methods (1) and (2) extract features from and model the envelope of the whole HRRP directly, ignoring the sequence dependencies between HRRP distance units that may reflect the physical structural characteristics of the target. Method (3) models the sequence correlation and thus describes the physical structural characteristics, but it still has several problems: (1) distance units with smaller amplitude may contain some highly separable features, yet these features are rarely used; (2) a unidirectional RNN can only use the current time step and the structural information before it during prediction, and cannot make good use of the overall structural prior contained in the HRRP.
Disclosure of Invention
In view of the above technical problems, the present invention provides a radar target recognition method based on an attention mechanism and a bidirectional stacked recurrent neural network: the HRRP samples are first preprocessed to reduce their sensitivity and a dynamic adjustment layer is established; a sliding window whose moving distance is smaller than its length is then selected to split the HRRP; the importance of each split sequence is then adjusted by an importance network; the temporal correlation of the samples is modeled by stacked bidirectional RNNs to extract high-level features of the samples; finally, a multi-level attention mechanism adjusts the importance of the hidden-layer states and target classification is performed by softmax.
In order to solve the technical problems, the invention adopts the following technical scheme:
A radar target identification method based on an attention mechanism and a bidirectional stacked recurrent neural network comprises the following steps:
S1, collecting a data set: the HRRP data collected by the radar are combined into data sets according to target type, and for each class of samples the training and test samples are selected from different data segments, ensuring during selection that the target-radar aspect angles covered by the training set also cover those of the test set; the ratio of training-set to test-set samples for each class of target is 8:2; the selected data set is denoted $T = \{(x_i, y_k)\}_{i\in[1,n],\,k\in[1,c]}$, where $x_i$ denotes the i-th sample, $y_k$ indicates that the sample belongs to the k-th class, c classes of targets are collected in total, and n denotes the total number of samples;
S2, preprocessing the original HRRP sample set: the intensity of an HRRP is jointly determined by factors such as the radar transmitting power, the target distance, the radar antenna gain and the radar receiver gain, so before target identification the original HRRP echo is processed by ℓ2 intensity normalization, which alleviates the intensity sensitivity of the HRRP; the HRRP is intercepted from the radar echo data through a range window, and the position of the recorded range profile within the range gate is not fixed during interception, which causes translation sensitivity of the HRRP, so the translation sensitivity is alleviated by a center-of-gravity alignment method;
S3, because the amplitude differences of the echoes in the different distance units of an HRRP are large, feeding the data directly into the convolution layer makes the model focus excessively on the distance units with large amplitude; however, distance units with small amplitude may contain some features with strong separability that are beneficial to radar target identification, so a dynamic adjustment layer is added to adjust the overall dynamic range of the HRRP before the HRRP is segmented; on the premise of not changing the relative amplitude relationship among the distance units, the adjustment layer determines through model training how to adjust the overall dynamics of the HRRP, so as to achieve a better recognition effect;
S4, selecting a sliding window of fixed length to segment the HRRP samples processed above, the segmented data format being the input format of the subsequent deep neural network;
S5, constructing an importance adjustment network to perform channel adjustment on the processed data; the importance network automatically learns the importance of each feature channel, and according to this importance enhances the useful features and suppresses the features that are of little use to the current task;
S6, building the deep neural classification network, tuning its parameters and optimizing it; a bidirectional recurrent neural network is adopted, the HRRP data are input into two independent RNNs in the forward and reverse directions respectively, and the resulting hidden layers are concatenated;
s7, performing preprocessing operations of steps S2, S3 and S4 of a training stage on the test data acquired by the step S1;
S8, sending the samples processed in S7 into the model constructed in S6 for testing to obtain the result, i.e., the output of the attention mechanism is finally classified by a softmax layer; the probability that the i-th HRRP test sample $x_{\text{test}}^{(i)}$ corresponds to the k-th class of radar target in the target set is calculated as:

$$P\left(k \mid x_{\text{test}}^{(i)}\right) = \frac{\exp\left(o_k\right)}{\sum_{j=1}^{c} \exp\left(o_j\right)}$$

where $o_j$ denotes the j-th output node of the fully connected layer, exp(·) denotes the exponential function, and c denotes the number of classes.
Preferably, the step S2 further comprises the steps of:
S201, intensity normalization: assume the original HRRP is denoted $x_{\text{raw}} = [x_1, x_2, \ldots, x_L]$, where L denotes the total number of distance units contained in the HRRP; the intensity-normalized HRRP is then expressed as:

$$x_{\text{norm}} = \frac{x_{\text{raw}}}{\left\| x_{\text{raw}} \right\|_2}$$
S202, sample alignment: the HRRP is translated so that its center of gravity g moves to the vicinity of L/2, so that the distance units containing information are distributed near the center of the HRRP; the center of gravity g of the HRRP is calculated as:

$$g = \frac{\sum_{i=1}^{L} i \cdot x_i}{\sum_{i=1}^{L} x_i}$$

where $x_i$ is the i-th distance unit of the original HRRP.
Preferably, the step S3 further includes: dynamically adjusting the HRRP sample, i.e., raising the sample to several powers; processing the data with power transforms reflects the diversity of the inter-class differences from multiple angles and presents the information contained in the radar HRRP in several different forms, which makes it convenient for the subsequent network to extract features from multiple angles for identification; the output of the dynamic adjustment layer is expressed as:

$$F_{\text{power}} = \left[ x_{\text{power}}^{(1)}, x_{\text{power}}^{(2)}, \ldots, x_{\text{power}}^{(M)} \right]$$

where M is the number of channels of the dynamic adjustment layer, and the i-th dynamically adjusted channel $x_{\text{power}}^{(i)}$ is expressed as:

$$x_{\text{power}}^{(i)} = \left( x_{\text{norm}} \right)^{\alpha_i}$$

where $\alpha_i$ denotes the exponent of the power transform.
Preferably, the S4 further includes:
S401, performing sliding-window segmentation on the dynamically adjusted HRRP sample; the window length is set to N and the sliding distance to d, where d < N, i.e., two adjacent segments after segmentation have an overlapping part of length N - d; the larger the overlap of the segmentation, the better the sequence characteristics in the HRRP sample are preserved and the more useful the features the subsequent deep neural network can learn for classification; the number of segments corresponds to the time-point dimension of the input format of the subsequent deep neural network, and the window length N corresponds to the input-signal dimension of each time point;
S402, the output after sliding-window segmentation is expressed as:

$$F_{\text{slide}} = \left[ x_{\text{slide}}^{(1)}, x_{\text{slide}}^{(2)}, \ldots, x_{\text{slide}}^{(M)} \right]$$

where M is the number of segments, and the t-th segment is

$$x_{\text{slide}}^{(t)} = \left[ x\big((t-1)d+1\big), x\big((t-1)d+2\big), \ldots, x\big((t-1)d+N\big) \right]$$

where d is the sliding distance of the window and N is the window length.
Preferably, the S5 further includes:
S501, the importance network performs importance adjustment on the segmented HRRP; by learning the global information of the convolution channels, it selectively emphasizes the input sequences of time points that carry more separable information and suppresses the input sequences of other, less important time points; after the importance-network adjustment the model becomes more balanced, more important and more useful features can be highlighted, and the ability of the model to represent the HRRP is improved; the importance adjustment is divided into two parts, feature compression and feature excitation;
S502, feature compression: the sample after sliding-window segmentation is $F_{\text{slide}} = \left[ x_{\text{slide}}^{(1)}, x_{\text{slide}}^{(2)}, \ldots, x_{\text{slide}}^{(M)} \right]$, a feature composed of M sequences, each of which is an N-dimensional vector; each sequence is compressed by a fully connected layer and an activation function into a real weight $x_{sq}$ representing the importance of that sequence; the output of the fully connected layer is calculated by the following formula:

$$x_{sq} = f\left( W x_{\text{slide}} + b \right)$$

where the activation function f(·) is the Sigmoid function and W and b are the weight and bias of the fully connected layer;
S503, feature excitation: the extracted features are selectively adjusted through the following expression to obtain the adjusted feature $F_E$:

$$F_E = x_{\text{slide}} \odot x_{sq}$$

where $x_{sq} = \left[ x_{sq}(1), x_{sq}(2), \ldots, x_{sq}(M) \right]$ is an M-dimensional vector and ⊙ indicates that each element in each channel of $x_{\text{slide}}$ is multiplied by the number in the corresponding dimension of this vector; for example, the m-th channel of feature $F_E$ is adjusted to:

$$F_E^{(m)} = x_{sq}(m) \cdot x_{\text{slide}}^{(m)}$$
preferably, specifically, the step S6 further includes:
s601, the classification network is designed as a multi-layer stacked bidirectional RNN, assuming that the input is feature F RNN
Figure BDA0002437405090000062
Wherein M is i Each time point dimension representing the ith bidirectional RNN, N representing the input sequence length, assuming its output is F output ,/>
Figure BDA0002437405090000063
Where H is the number of hidden units, where the vector corresponding to the kth time point in the sequence can be expressed as:
Figure BDA0002437405090000064
wherein f (·) represents the activation function,
Figure BDA0002437405090000065
representing a hidden layer output matrix corresponding to a forward RNN included in an ith bidirectional RNN,/and>
Figure BDA0002437405090000066
represents the kth hidden layer state contained in the forward RNN contained in the ith bidirectional RNN, and similarly,/is>
Figure BDA0002437405090000067
Represents a hidden layer output matrix corresponding to a backward RNN included in an ith bidirectional RNN,/v>
Figure BDA0002437405090000068
Represents the kth hidden layer state, b, contained in the backward RNN contained in the ith bidirectional RNN Fi An output layer bias representing the ith bidirectional RNN;
S602, according to the attention mechanism in the network, the hidden layers obtained at different time points by the last several bidirectional RNNs are selected and concatenated; the concatenated hidden-layer state of the i-th layer at the k-th time point is:

$$h_{ik} = \left[ \vec{h}_k^{(i)}; \overleftarrow{h}_k^{(i)} \right]$$

finally, the concatenated hidden layers of each layer are weighted and summed to obtain the hidden layer $c_{ATT}$ after processing by the attention model:

$$c_{ATT} = \sum_{i=N_1-N_0+1}^{N_1} \sum_{k=1}^{M} \alpha_{ik} h_{ik}$$

where $\alpha_{ik}$ denotes the weight corresponding to the k-th time point of the i-th layer, M denotes the number of hidden states contained in the forward RNN or the backward RNN of each layer of the bidirectional RNN model, i.e., the time-point dimension, $N_1$ denotes the number of stacked layers of the network, and $N_0$ denotes how many of the stacked bidirectional RNN layers, counted from the last layer, are used to compute $c_{ATT}$; $\alpha_{ik}$ is obtained as follows:

$$\alpha_{ik} = \frac{\exp\left(e_{ik}\right)}{\sum_{k'=1}^{M} \exp\left(e_{ik'}\right)}$$

where $e_{ik}$ is the energy of the combined forward and backward hidden states in the i-th bidirectional RNN, expressed as:

$$e_{ik} = U_{ATT} \tanh\left( W_{ATT} h_{ik} \right)$$

where $U_{ATT}$ and $W_{ATT}$ are the parameters used to calculate the energy of the hidden units, l is the dimension of the hidden units, and M is the time-point dimension;
S603, the loss function is designed as the cross entropy; the parameters are learned by computing the gradients of the loss function with respect to the parameters using the training data, and the learned parameters are fixed when the model converges; the adopted cross-entropy cost function is expressed as:

$$\mathcal{L} = -\sum_{n=1}^{N} \sum_{i=1}^{c} e_n(i) \log P\left(i \mid x_{\text{train}}^{(n)}\right)$$

where N denotes the number of training samples in a batch, $e_n$ is the one-hot vector representing the true label of the n-th training sample, and $P(i \mid x_{\text{train}}^{(n)})$ denotes the probability that the training sample corresponds to the i-th target.
The invention has the following beneficial effects:
(1) The embodiment of the invention applies a dynamic adjustment layer: because some well-separable features may, owing to their relative amplitude, have difficulty influencing the decision of the subsequent classifier, the dynamic adjustment layer determines through model training how to adjust the overall dynamics of the HRRP, on the premise of not changing the relative amplitude relationship among the distance units, so as to achieve a better recognition effect.
(2) The embodiment of the invention applies an importance adjustment network, which, by learning the global information of the convolution channels, can selectively emphasize the channels carrying more separable information and suppress the less useful ones. After the adjustment the model becomes more balanced across the channels, so that more important and more useful features can be highlighted, improving the HRRP representation capability of the model.
(3) Unlike existing models built on the HRRP structure, the embodiment of the invention stacks bidirectional recurrent neural networks so that the model has a certain depth. A model organized in this way can better abstract high-level structural features step by step depending on the context of the data, and the hidden states in each bidirectional recurrent layer contain structural representations of different levels, which helps apply the HRRP better for recognition.
(4) The embodiment of the invention applies an attention model: in classification, the weight given to the decision by the central region where the signal is concentrated is increased, while the weight given to the decision by the noise regions on both sides is reduced.
Drawings
Fig. 1 is a flowchart of the steps of a radar target recognition method based on an attention mechanism and a bi-directional stacked recurrent neural network according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Referring to fig. 1, a flowchart of steps of a radar target recognition method based on an attention mechanism and a bi-directional stacked recurrent neural network according to an embodiment of the present invention is shown, specifically, the method includes the following steps:
S1, collecting a data set: the HRRP data collected by the radar are combined into data sets according to target type, and for each class of samples the training and test samples are selected from different data segments, ensuring during selection that the target-radar aspect angles covered by the training set also cover those of the test set; the ratio of training-set to test-set samples for each class of target is 8:2; the selected data set is denoted $T = \{(x_i, y_k)\}_{i\in[1,n],\,k\in[1,c]}$, where $x_i$ denotes the i-th sample, $y_k$ indicates that the sample belongs to the k-th class, c classes of targets are collected in total, and n denotes the total number of samples;
S2, preprocessing the original HRRP sample set: the intensity of an HRRP is jointly determined by factors such as the radar transmitting power, the target distance, the radar antenna gain and the radar receiver gain, so before target identification the original HRRP echo is processed by ℓ2 intensity normalization, which alleviates the intensity sensitivity of the HRRP; the HRRP is intercepted from the radar echo data through a range window, and the position of the recorded range profile within the range gate is not fixed during interception, which causes translation sensitivity of the HRRP, so the translation sensitivity is alleviated by a center-of-gravity alignment method;
specifically, S2 further comprises the steps of:
S201, intensity normalization: assume the original HRRP is denoted $x_{\text{raw}} = [x_1, x_2, \ldots, x_L]$, where L denotes the total number of distance units contained in the HRRP; the intensity-normalized HRRP can then be expressed as:

$$x_{\text{norm}} = \frac{x_{\text{raw}}}{\left\| x_{\text{raw}} \right\|_2}$$
S202, sample alignment. The HRRP is translated so that its center of gravity g moves to the vicinity of L/2, so that the distance units containing information are distributed near the center. The center of gravity g of the HRRP is calculated as:

$$g = \frac{\sum_{i=1}^{L} i \cdot x_i}{\sum_{i=1}^{L} x_i}$$

where $x_i$ is the i-th distance unit of the original HRRP.
After the original HRRP samples are processed by the intensity normalization and center-of-gravity alignment methods, the amplitudes are limited to between 0 and 1 and the scale is unified; values between 0 and 1 are very favorable for the subsequent neural network processing, and HRRP echo signals that were distributed toward the right or left are adjusted to near the center point.
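As an illustration of the two preprocessing operations above, the following NumPy sketch applies ℓ2 intensity normalization and center-of-gravity alignment to a single HRRP vector; the function names and the use of a circular shift to move the center of gravity toward L/2 are assumptions made for illustration, not details fixed by the patent.

```python
import numpy as np

def intensity_normalize(x_raw):
    """Divide the HRRP by its l2 norm so amplitudes fall between 0 and 1."""
    return x_raw / np.linalg.norm(x_raw, ord=2)

def center_of_gravity(x):
    """g = sum_i(i * x_i) / sum_i(x_i), using 1-based distance-unit indices."""
    idx = np.arange(1, len(x) + 1)
    return np.sum(idx * x) / np.sum(x)

def align_to_center(x):
    """Shift the profile so its center of gravity lands near L/2."""
    L = len(x)
    shift = int(round(L / 2 - center_of_gravity(x)))
    return np.roll(x, shift)   # circular shift; an illustrative choice

# Example: preprocess one simulated 256-dimensional HRRP sample
hrrp = np.abs(np.random.randn(256))
hrrp = align_to_center(intensity_normalize(hrrp))
```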
S3, because the amplitude differences of the echoes in the different distance units of an HRRP are large, feeding the data directly into the convolution layer makes the model focus excessively on the distance units with large amplitude; however, distance units with small amplitude may contain some features with strong separability that are beneficial to radar target identification, so a dynamic adjustment layer is added to adjust the overall dynamic range of the HRRP before the HRRP is segmented; on the premise of not changing the relative amplitude relationship among the distance units, the adjustment layer determines through model training how to adjust the overall dynamics of the HRRP, so as to achieve a better recognition effect;
The step S3 further includes dynamically adjusting the HRRP sample, i.e., raising the sample to several powers; processing the data with power transforms reflects the diversity of the inter-class differences from multiple angles and presents the information contained in the radar HRRP in several different forms, which makes it convenient for the subsequent network to extract features from multiple angles for identification; the output of the dynamic adjustment layer can be expressed as:

$$F_{\text{power}} = \left[ x_{\text{power}}^{(1)}, x_{\text{power}}^{(2)}, \ldots, x_{\text{power}}^{(M)} \right]$$

where M is the number of channels of the dynamic adjustment layer, and the i-th dynamically adjusted channel $x_{\text{power}}^{(i)}$ can be expressed as:

$$x_{\text{power}}^{(i)} = \left( x_{\text{norm}} \right)^{\alpha_i}$$

where $\alpha_i$ denotes the exponent of the power transform.
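A minimal sketch of this dynamic adjustment is given below, assuming each channel is simply the element-wise power of the normalized HRRP; the exponent values chosen here are illustrative only, whereas in the described method the exponents are adjusted through model training.

```python
import numpy as np

def dynamic_adjust(x_norm, alphas=(0.25, 0.5, 1.0, 2.0)):
    """Return an (M, L) array whose i-th row is x_norm raised element-wise to alphas[i]."""
    return np.stack([np.power(x_norm, a) for a in alphas], axis=0)

x_norm = np.abs(np.random.randn(256))
x_norm /= np.linalg.norm(x_norm)      # intensity-normalized HRRP (see S2)
channels = dynamic_adjust(x_norm)
print(channels.shape)                 # (4, 256): M adjustment channels of length L
```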
S4, selecting a sliding window of fixed length to segment the HRRP samples processed above, the segmented data format being the input format of the subsequent deep neural network;
the S4 further includes:
S401, performing sliding-window segmentation on the dynamically adjusted HRRP sample; the window length is set to N and the sliding distance to d, where d < N, i.e., two adjacent segments after segmentation have an overlapping part of length N - d; the larger the overlap of the segmentation, the better the sequence characteristics in the HRRP sample are preserved and the more useful the features the subsequent deep neural network can learn for classification; the number of segments corresponds to the time-point dimension of the input format of the subsequent deep neural network, and the window length N corresponds to the input-signal dimension of each time point;
S402, the output after sliding-window segmentation may be expressed as:

$$F_{\text{slide}} = \left[ x_{\text{slide}}^{(1)}, x_{\text{slide}}^{(2)}, \ldots, x_{\text{slide}}^{(M)} \right]$$

where M is the number of segments, and the t-th segment is

$$x_{\text{slide}}^{(t)} = \left[ x\big((t-1)d+1\big), x\big((t-1)d+2\big), \ldots, x\big((t-1)d+N\big) \right]$$

where d is the sliding distance of the window and N is the window length.
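The overlapping segmentation of S401 and S402 can be sketched as follows; the segment count M = (L - N) // d + 1 is an assumption consistent with a window of length N sliding by d over an L-point profile.

```python
import numpy as np

def sliding_window_split(x, N=32, d=8):
    """Split a length-L profile into M = (L - N) // d + 1 overlapping segments of length N."""
    L = len(x)
    M = (L - N) // d + 1
    return np.stack([x[t * d: t * d + N] for t in range(M)], axis=0)   # shape (M, N)

x = np.abs(np.random.randn(256))      # stands in for one dynamically adjusted channel
segments = sliding_window_split(x)
print(segments.shape)                 # (29, 32): time points M by per-step input size N
```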
S5, constructing an importance adjustment network to perform channel adjustment on the processed data; the importance network automatically learns the importance of each feature channel, and according to this importance enhances the useful features and suppresses the features that are of little use to the current task;
specifically, the S5 further includes:
S501, the importance network performs importance adjustment on the segmented HRRP; by learning the global information of the convolution channels, it selectively emphasizes the input sequences of time points that carry more separable information and suppresses the input sequences of other, less important time points; after the importance-network adjustment the model becomes more balanced, more important and more useful features can be highlighted, and the ability of the model to represent the HRRP is improved; the importance adjustment is divided into two parts, feature compression and feature excitation;
S502, feature compression: the sample after sliding-window segmentation is $F_{\text{slide}} = \left[ x_{\text{slide}}^{(1)}, x_{\text{slide}}^{(2)}, \ldots, x_{\text{slide}}^{(M)} \right]$, a feature composed of M sequences, each of which is an N-dimensional vector; each sequence is compressed by a fully connected layer and an activation function into a real weight $x_{sq}$ representing the importance of that sequence; the output of the fully connected layer can be calculated by the following formula:

$$x_{sq} = f\left( W x_{\text{slide}} + b \right)$$

where the activation function f(·) is the Sigmoid function and W and b are the weight and bias of the fully connected layer;
S503, feature excitation: the extracted features are selectively adjusted through the following expression to obtain the adjusted feature $F_E$:

$$F_E = x_{\text{slide}} \odot x_{sq}$$

where $x_{sq} = \left[ x_{sq}(1), x_{sq}(2), \ldots, x_{sq}(M) \right]$ is an M-dimensional vector and ⊙ indicates that each element in each channel of $x_{\text{slide}}$ is multiplied by the number in the corresponding dimension of this vector; for example, the m-th channel of feature $F_E$ is adjusted to:

$$F_E^{(m)} = x_{sq}(m) \cdot x_{\text{slide}}^{(m)}$$
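A sketch of the importance adjustment of S501 to S503 follows, assuming a single fully connected layer that maps each N-dimensional segment to one sigmoid weight; the layer sizes and the PyTorch formulation are illustrative assumptions rather than the exact network of the embodiment.

```python
import torch
import torch.nn as nn

class ImportanceNet(nn.Module):
    """Compress each N-dim segment to one sigmoid weight x_sq, then excite: F_E = x_slide * x_sq."""
    def __init__(self, seg_len):
        super().__init__()
        self.fc = nn.Linear(seg_len, 1)            # one importance weight per segment

    def forward(self, x_slide):                    # x_slide: (batch, M, N)
        x_sq = torch.sigmoid(self.fc(x_slide))     # (batch, M, 1)
        return x_slide * x_sq                      # broadcast over each segment's N values

x_slide = torch.randn(16, 29, 32)                  # a batch of segmented HRRPs
f_e = ImportanceNet(seg_len=32)(x_slide)           # adjusted features, same shape
```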
S6, building the deep neural classification network, tuning its parameters and optimizing it; a bidirectional recurrent neural network is adopted, the HRRP data are input into two independent RNNs in the forward and reverse directions respectively, and the resulting hidden layers are concatenated.
The traditional RNN model is unidirectional: when HRRP data are fed into such a model, they can only be input along one direction, so the input at the current time step depends only on the data before it, and the information of later time steps cannot be effectively used at the current time step. However, the HRRP contains the physical-structure prior of the whole target, and considering only unidirectional information is unfavorable for modeling and identifying HRRP features. In particular, when a unidirectional RNN is applied, most of the observed data at small time t are noise, and it is difficult for the RNN to accurately model the structural characteristics of the target. Therefore, the embodiment of the invention adopts a bidirectional recurrent neural network, which inputs the HRRP data into two independent RNNs in the forward and reverse directions respectively and concatenates the resulting hidden layers, thereby remedying the shortcomings of the unidirectional RNN and better modeling the physical structural characteristics contained in the HRRP. The embodiment of the invention stacks bidirectional recurrent neural networks so that the model has a certain depth; a model organized in this way can better abstract high-level structural features step by step depending on the context of the data, and the hidden states in each bidirectional recurrent layer contain structural representations of different levels, which helps apply the HRRP better for recognition. An attention model is applied on this basis, i.e., in classification the weight given to the decision by the central region where the signal is concentrated is increased, while the weight given to the decision by the noise regions on both sides is reduced. That is, the deep neural network model in the embodiment of the invention is formed by stacking five layers of bidirectional LSTM (long short-term memory) networks with an attention mechanism, and finally a softmax layer classifies the output of the network.
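A compact sketch of such a stacked bidirectional LSTM backbone is given below; the five-layer depth follows the description above, while the hidden size and the choice of returning every layer's hidden states for the later attention stage are assumptions for illustration.

```python
import torch
import torch.nn as nn

class StackedBiLSTM(nn.Module):
    """Stack of bidirectional LSTM layers; each layer's output is kept for the attention stage."""
    def __init__(self, input_size, hidden_size=64, num_layers=5):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.LSTM(input_size if i == 0 else 2 * hidden_size, hidden_size,
                    batch_first=True, bidirectional=True)
            for i in range(num_layers))

    def forward(self, x):                          # x: (batch, M, N)
        per_layer = []
        for lstm in self.layers:
            x, _ = lstm(x)                         # (batch, M, 2H): forward and backward states
            per_layer.append(x)
        return per_layer                           # N1 tensors, one per bidirectional layer

hidden_states = StackedBiLSTM(input_size=32)(torch.randn(16, 29, 32))
```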
In particular, the step S6 further includes:
S601, assume the input is the feature $F_{RNN} \in \mathbb{R}^{M_i \times N}$, where $M_i$ denotes the time-point dimension of the i-th bidirectional RNN and N denotes the input sequence length, and assume its output is $F_{output} \in \mathbb{R}^{M_i \times 2H}$, where H is the number of hidden units; the vector corresponding to the k-th time point of the sequence can be expressed as:

$$F_{output}(k) = f\left( W_{\vec{h}}^{(i)} \vec{h}_k^{(i)} + W_{\overleftarrow{h}}^{(i)} \overleftarrow{h}_k^{(i)} + b_{F_i} \right)$$

where f(·) denotes the activation function, $W_{\vec{h}}^{(i)}$ denotes the hidden-layer output matrix corresponding to the forward RNN contained in the i-th bidirectional RNN, $\vec{h}_k^{(i)}$ denotes the k-th hidden-layer state contained in that forward RNN; similarly, $W_{\overleftarrow{h}}^{(i)}$ denotes the hidden-layer output matrix corresponding to the backward RNN contained in the i-th bidirectional RNN, $\overleftarrow{h}_k^{(i)}$ denotes the k-th hidden-layer state contained in that backward RNN, and $b_{F_i}$ denotes the output-layer bias of the i-th bidirectional RNN.
S602, the hidden layers obtained at different time points by the last several bidirectional RNNs are selected and concatenated; the concatenated hidden-layer state of the i-th layer at the k-th time point is:

$$h_{ik} = \left[ \vec{h}_k^{(i)}; \overleftarrow{h}_k^{(i)} \right]$$

finally, the concatenated hidden layers of each layer are weighted and summed to obtain the hidden layer $c_{ATT}$ after processing by the attention model:

$$c_{ATT} = \sum_{i=N_1-N_0+1}^{N_1} \sum_{k=1}^{M} \alpha_{ik} h_{ik}$$

where $\alpha_{ik}$ denotes the weight corresponding to the k-th time point of the i-th layer, M denotes the number of hidden states contained in the forward RNN or the backward RNN of each layer of the bidirectional RNN model, i.e., the time-point dimension, $N_1$ denotes the number of stacked layers of the network, and $N_0$ denotes how many of the stacked bidirectional RNN layers, counted from the last layer, are used to compute $c_{ATT}$. $\alpha_{ik}$ is obtained as follows:

$$\alpha_{ik} = \frac{\exp\left(e_{ik}\right)}{\sum_{k'=1}^{M} \exp\left(e_{ik'}\right)}$$

where $e_{ik}$ is the energy of the combined forward and backward hidden states in the i-th bidirectional RNN and can be expressed as:

$$e_{ik} = U_{ATT} \tanh\left( W_{ATT} h_{ik} \right)$$

where $U_{ATT}$ and $W_{ATT}$ are the parameters used to calculate the energy of the hidden units, l is the dimension of the hidden units, and M is the time-point dimension.
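The attention aggregation of S602 can be sketched as follows, assuming the softmax over $\alpha_{ik}$ is taken across the time points of each layer and that only the last $N_0$ layers are summed into $c_{ATT}$; these choices follow the formulas above, but the exact normalization in the lost equation image is not recoverable, so this is a sketch under stated assumptions.

```python
import torch
import torch.nn as nn

class MultiLevelAttention(nn.Module):
    """e_ik = U_att tanh(W_att h_ik), alpha_ik = softmax over k, c_ATT = sum_i sum_k alpha_ik h_ik."""
    def __init__(self, state_size, attn_size=32, n_last_layers=2):
        super().__init__()
        self.w_att = nn.Linear(state_size, attn_size, bias=False)
        self.u_att = nn.Linear(attn_size, 1, bias=False)
        self.n_last_layers = n_last_layers                   # N0 layers, counted from the top

    def forward(self, per_layer_states):                     # list of (batch, M, 2H) tensors
        c_att = 0
        for h in per_layer_states[-self.n_last_layers:]:
            e = self.u_att(torch.tanh(self.w_att(h)))        # (batch, M, 1) energies
            alpha = torch.softmax(e, dim=1)                  # weights over the M time points
            c_att = c_att + (alpha * h).sum(dim=1)           # (batch, 2H) weighted sum
        return c_att

states = [torch.randn(16, 29, 128) for _ in range(5)]        # stand-ins for the BiLSTM outputs
c_att = MultiLevelAttention(state_size=128)(states)          # (16, 128)
```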
S603, the outputs of the attention mechanism are concatenated and then fed into a fully connected layer whose number of nodes equals the number of radar classes; the output of this fully connected layer is the prediction result of the model and can be expressed as:

$$\text{output} = f\left( C\left(c_{ATT}\right) W_o \right)$$

where C(·) is the concatenation operation, $W_o$ is the weight matrix of the fully connected layer, c denotes the number of classes, and f(·) denotes the softmax function.
S604, the loss function is designed as the cross entropy. The parameters are learned by computing the gradients of the loss function with respect to the parameters using the training data, and the learned parameters are fixed when the model converges. The invention adopts a cost function based on the cross entropy, which can be expressed as:

$$\mathcal{L} = -\sum_{n=1}^{N} \sum_{i=1}^{c} e_n(i) \log P\left(i \mid x_{\text{train}}^{(n)}\right)$$

where N denotes the number of training samples in a batch, $e_n$ is the one-hot vector representing the true label of the n-th training sample, and $P(i \mid x_{\text{train}}^{(n)})$ denotes the probability that the training sample corresponds to the i-th target.
S605, all weights and biases to be trained in the model are initialized, the training parameters, including the learning rate, the amount of training data per batch and the number of training epochs, are set, and model training is started.
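A brief sketch of the classification head and one training step of S603 to S605 is given below, assuming $c_{ATT}$ is fed through a fully connected layer with c output nodes and trained with the cross-entropy cost; the class count, optimizer and learning rate are illustrative assumptions.

```python
import torch
import torch.nn as nn

num_classes = 3                                      # c: illustrative number of target classes
classifier = nn.Linear(128, num_classes)             # maps c_ATT (here 2H = 128) to class scores
criterion = nn.CrossEntropyLoss()                    # log-softmax + negative log-likelihood
optimizer = torch.optim.Adam(classifier.parameters(), lr=1e-3)

c_att = torch.randn(16, 128)                         # stands in for the attention output
labels = torch.randint(0, num_classes, (16,))        # placeholder true labels
loss = criterion(classifier(c_att), labels)          # cross-entropy cost
optimizer.zero_grad()
loss.backward()
optimizer.step()
```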
S7, performing preprocessing operations of steps S2, S3 and S4 of a training stage on the test data acquired by the step S1;
S8, sending the samples processed in S7 into the model constructed in S6 for testing to obtain the result, i.e., the output of the attention mechanism is finally classified by a softmax layer; the probability that the i-th HRRP test sample $x_{\text{test}}^{(i)}$ corresponds to the k-th class of radar target in the target set may be calculated as:

$$P\left(k \mid x_{\text{test}}^{(i)}\right) = \frac{\exp\left(o_k\right)}{\sum_{j=1}^{c} \exp\left(o_j\right)}$$

where $o_j$ denotes the j-th output node of the fully connected layer, exp(·) denotes the exponential function, and c denotes the number of classes.
According to the maximum posterior probability criterion, the test HRRP sample $x_{\text{test}}$ is classified into the target class $k_0$ with the maximum probability:

$$k_0 = \arg\max_{k} P\left(k \mid x_{\text{test}}\right)$$
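The test-stage decision amounts to a softmax followed by an argmax; a short sketch, assuming the trained classification head of the preceding sketch:

```python
import torch
import torch.nn as nn

classifier = nn.Linear(128, 3)                       # trained head assumed; sizes illustrative
c_att_test = torch.randn(1, 128)                     # attention feature of one test HRRP
with torch.no_grad():
    probs = torch.softmax(classifier(c_att_test), dim=1)   # P(k | x_test) for each class
    k0 = probs.argmax(dim=1)                               # class with maximum posterior
```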
Through the above eight steps, the radar target recognition model based on the attention mechanism and the bidirectional stacked recurrent neural network provided by the invention is obtained.
It should be understood that the exemplary embodiments described herein are illustrative and not limiting. Although one or more embodiments of the present invention have been described with reference to the accompanying drawings, it will be understood by those of ordinary skill in the art that various changes in form and details may be made therein without departing from the spirit and scope of the present invention as defined by the following claims.

Claims (5)

1. A radar target identification method based on an attention mechanism and a bidirectional stacked recurrent neural network, characterized by comprising the following steps:
S1, collecting a data set: the HRRP data collected by the radar are combined into data sets according to target type, and for each class of samples the training and test samples are selected from different data segments, ensuring during selection that the target-radar aspect angles covered by the training set also cover those of the test set; the ratio of training-set to test-set samples for each class of target is 8:2; the selected data set is denoted $T = \{(x_{i1}, y_{k1})\}_{i1\in[1,n1],\,k1\in[1,c]}$, where $x_{i1}$ denotes the i1-th sample, $y_{k1}$ indicates that the sample belongs to the k1-th class, c classes of targets are collected in total, and n1 denotes the total number of samples;
S2, preprocessing the original HRRP sample set: the intensity of an HRRP is jointly determined by factors such as the radar transmitting power, the target distance, the radar antenna gain and the radar receiver gain, so before target identification the original HRRP echo is processed by ℓ2 intensity normalization, which alleviates the intensity sensitivity of the HRRP; the HRRP is intercepted from the radar echo data through a range window, and the position of the recorded range profile within the range gate is not fixed during interception, which causes translation sensitivity of the HRRP, so the translation sensitivity is alleviated by a center-of-gravity alignment method;
S3, because the amplitude differences of the echoes in the different distance units of an HRRP are large, feeding the data directly into the convolution layer makes the model focus excessively on the distance units with large amplitude, while the distance units with small amplitude contain some features with strong separability that are beneficial to radar target identification, so a dynamic adjustment layer is added to adjust the overall dynamic range of the HRRP before the HRRP is segmented; on the premise of not changing the relative amplitude relationship among the distance units, the adjustment layer is trained with the model to determine how to adjust the overall dynamics of the HRRP, so as to achieve a better recognition effect;
S4, selecting a sliding window of fixed length to segment the HRRP samples processed above, the segmented data format being the input format of the subsequent deep neural network;
S5, constructing an importance adjustment network to perform channel adjustment on the processed data; the importance network automatically learns the importance of each feature channel, and according to this importance enhances the useful features and suppresses the features that are of little use to the current task;
S6, building the deep neural classification network, tuning its parameters and optimizing it; a bidirectional recurrent neural network is adopted, the HRRP data are input into two independent RNNs in the forward and reverse directions respectively, and the resulting hidden layers are concatenated;
the S6 further includes:
S601, the classification network is designed as a multi-layer stacked bidirectional RNN; assume the input of the $i_2$-th bidirectional RNN is the feature $F_{RNN} \in \mathbb{R}^{M_{i_2} \times N_2}$, where $M_{i_2}$ denotes the time-point dimension of the $i_2$-th bidirectional RNN and $N_2$ denotes the input sequence length, and assume its output is $F_{output} \in \mathbb{R}^{M_{i_2} \times 2H}$, where H is the number of hidden units; the vector corresponding to the $k_2$-th time point of the sequence is expressed as:

$$F_{output}(k_2) = f\left( W_{\vec{h}}^{(i_2)} \vec{h}_{k_2}^{(i_2)} + W_{\overleftarrow{h}}^{(i_2)} \overleftarrow{h}_{k_2}^{(i_2)} + b_{F_{i_2}} \right)$$

where f(·) denotes the activation function, $W_{\vec{h}}^{(i_2)}$ denotes the hidden-layer output matrix corresponding to the forward RNN contained in the $i_2$-th bidirectional RNN, $\vec{h}_{k_2}^{(i_2)}$ denotes the $k_2$-th hidden-layer state contained in that forward RNN; similarly, $W_{\overleftarrow{h}}^{(i_2)}$ denotes the hidden-layer output matrix corresponding to the backward RNN contained in the $i_2$-th bidirectional RNN, $\overleftarrow{h}_{k_2}^{(i_2)}$ denotes the $k_2$-th hidden-layer state contained in that backward RNN, and $b_{F_{i_2}}$ denotes the output-layer bias of the $i_2$-th bidirectional RNN;
S602, according to the attention mechanism in the network, the hidden layers obtained at different time points by the last several bidirectional RNNs are selected and concatenated; the concatenated hidden-layer state of the i-th layer at the k-th time point is:

$$h_{ik} = \left[ \vec{h}_k^{(i)}; \overleftarrow{h}_k^{(i)} \right]$$

finally, the concatenated hidden layers of each layer are weighted and summed to obtain the hidden layer $c_{ATT}$ after processing by the attention model:

$$c_{ATT} = \sum_{i=N_1-N_0+1}^{N_1} \sum_{k=1}^{M_1} \alpha_{ik} h_{ik}$$

where $\alpha_{ik}$ denotes the weight corresponding to the k-th time point of the i-th layer, $M_1$ denotes the number of hidden states contained in the forward RNN or the backward RNN of each layer of the bidirectional RNN model, i.e., the time-point dimension, $N_1$ denotes the number of stacked layers of the network, and $N_0$ denotes how many of the stacked bidirectional RNN layers, counted from the last layer, are used to compute $c_{ATT}$; $\alpha_{i_3 k_3}$ is obtained as follows:

$$\alpha_{i_3 k_3} = \frac{\exp\left(e_{i_3 k_3}\right)}{\sum_{k'=1}^{M_1} \exp\left(e_{i_3 k'}\right)}$$

where $e_{i_3 k_3}$ is the energy of the combined forward and backward hidden states in the $i_3$-th bidirectional RNN, expressed as:

$$e_{i_3 k_3} = U_{ATT} \tanh\left( W_{ATT} h_{i_3 k_3} \right)$$

where $U_{ATT}$ and $W_{ATT}$ are the parameters used to calculate the energy of the hidden units, l is the dimension of the hidden units, and $M_1$ is the time-point dimension;
S603, the loss function is designed as the cross entropy; the parameters are learned by computing the gradients of the loss function with respect to the parameters using the training data, and the learned parameters are fixed when the model converges; the adopted cross-entropy cost function is expressed as:

$$\mathcal{L} = -\sum_{n=1}^{N_3} \sum_{i_3=1}^{c} e_n(i_3) \log P\left(i_3 \mid x_{\text{train}}^{(n)}\right)$$

where $N_3$ denotes the number of training samples in a batch, $e_n$ is the one-hot vector representing the true label of the n-th training sample, and $P(i_3 \mid x_{\text{train}}^{(n)})$ denotes the probability that the training sample corresponds to the $i_3$-th target;
s7, performing preprocessing operations of steps S2, S3 and S4 of a training stage on the test data acquired by the step S1;
S8, sending the samples processed in S7 into the model constructed in S6 for testing to obtain the result, i.e., the output of the attention mechanism is finally classified by a softmax layer; the probability that the $i_{13}$-th HRRP test sample $x_{\text{test}}^{(i_{13})}$ corresponds to the $k_4$-th class of radar target in the target set is calculated as:

$$P\left(k_4 \mid x_{\text{test}}^{(i_{13})}\right) = \frac{\exp\left(o_{k_4}\right)}{\sum_{j=1}^{c} \exp\left(o_j\right)}$$

where $o_j$ denotes the j-th output node of the fully connected layer, exp(·) denotes the exponential function, and c denotes the number of classes.
2. The method for radar target identification based on an attention mechanism and a bi-directional stacked recurrent neural network as claimed in claim 1, wherein said S2 further comprises the steps of:
S201, intensity normalization: assume the original HRRP is denoted $x_{\text{raw}} = [x_1, x_2, \ldots, x_i, \ldots, x_L]$, where L denotes the total number of distance units contained in the HRRP; the intensity-normalized HRRP is then expressed as:

$$x_{\text{norm}} = \frac{x_{\text{raw}}}{\left\| x_{\text{raw}} \right\|_2}$$
S202, sample alignment: the HRRP is translated so that its center of gravity g moves to the vicinity of L/2, so that the distance units containing information are distributed near the center of the HRRP; the center of gravity g of the HRRP is calculated as:

$$g = \frac{\sum_{i_4=1}^{L} i_4 \cdot x_{i_4}}{\sum_{i_4=1}^{L} x_{i_4}}$$

where $x_{i_4}$ is the $i_4$-th distance unit of the original HRRP.
3. The method for radar target identification based on an attention mechanism and a bidirectional stacked recurrent neural network of claim 1, wherein S3 further comprises: dynamically adjusting the HRRP sample, i.e., raising the sample to several powers; processing the data with power transforms reflects the diversity of the inter-class differences from multiple angles and presents the information contained in the radar HRRP in several different forms, which makes it convenient for the subsequent network to extract features from multiple angles for identification; the output of the dynamic adjustment layer is expressed as:

$$F_{\text{power}} = \left[ x_{\text{power}}^{(1)}, x_{\text{power}}^{(2)}, \ldots, x_{\text{power}}^{(M_5)} \right]$$

where $M_5$ is the number of channels of the dynamic adjustment layer, and the $i_5$-th dynamically adjusted channel $x_{\text{power}}^{(i_5)}$ is expressed as:

$$x_{\text{power}}^{(i_5)} = \left( x_{\text{norm}} \right)^{\alpha_{i_5}}$$

where $\alpha_{i_5}$ denotes the exponent of the power transform.
4. The method for radar target identification based on an attention mechanism and a bi-directional stacked recurrent neural network as claimed in claim 3, wherein said S4 further comprises:
S401, performing sliding-window segmentation on the dynamically adjusted HRRP sample; the window length is set to $N_4$ and the sliding distance to d, where d < $N_4$, i.e., two adjacent segments after segmentation have an overlapping part of length $N_4$ - d; the larger the overlap of the segmentation, the better the sequence characteristics in the HRRP sample are preserved and the more useful the features the subsequent deep neural network can learn for classification; the number of segments corresponds to the time-point dimension of the input format of the subsequent deep neural network, and the window length $N_4$ corresponds to the input-signal dimension of each time point;
S402, the output after sliding-window segmentation is expressed as:

$$F_{\text{slide}} = \left[ x_{\text{slide}}^{(1)}, x_{\text{slide}}^{(2)}, \ldots, x_{\text{slide}}^{(M_6)} \right]$$

where $M_6$ is the number of segments, and the t-th segment is

$$x_{\text{slide}}^{(t)} = \left[ x\big((t-1)d+1\big), x\big((t-1)d+2\big), \ldots, x\big((t-1)d+N_4\big) \right]$$

where d is the sliding distance of the window and $N_4$ is the window length.
5. The method for radar target identification based on an attention mechanism and a bi-directional stacked recurrent neural network of claim 4, wherein S5 further comprises:
S501, the importance network performs importance adjustment on the segmented HRRP; by learning the global information of the convolution channels, it selectively emphasizes the input sequences of time points that carry more separable information and suppresses the input sequences of other, less important time points; after the importance-network adjustment the model becomes more balanced, more important and more useful features can be highlighted, and the ability of the model to represent the HRRP is improved; the importance adjustment is divided into two parts, feature compression and feature excitation;
S502, feature compression: the sample after sliding-window segmentation is $F_{\text{slide}} = \left[ x_{\text{slide}}^{(1)}, x_{\text{slide}}^{(2)}, \ldots, x_{\text{slide}}^{(M_6)} \right]$, a feature composed of $M_6$ sequences, each of which is an $N_6$-dimensional vector; each sequence is compressed by a fully connected layer and an activation function into a real weight $x_{sq}$ representing the importance of that sequence; the output of the fully connected layer is calculated by the following formula:

$$x_{sq} = f\left( W x_{\text{slide}} + b \right)$$

where the activation function f(·) is the Sigmoid function and W and b are the weight and bias of the fully connected layer;
S503, feature excitation: the extracted features are selectively adjusted through the following expression to obtain the adjusted feature $F_E$:

$$F_E = x_{\text{slide}} \odot x_{sq}$$

where $x_{sq} = \left[ x_{sq}(1), x_{sq}(2), \ldots, x_{sq}(M_7) \right]$ is an $M_7$-dimensional vector and ⊙ indicates that each element in each channel of $x_{\text{slide}}$ is multiplied by the number in the corresponding dimension of this vector; for example, the m-th channel of feature $F_E$ is adjusted to:

$$F_E^{(m)} = x_{sq}(m) \cdot x_{\text{slide}}^{(m)}$$
CN202010256158.0A 2020-04-02 2020-04-02 Radar target identification method based on attention mechanism and bidirectional stacking cyclic neural network Active CN111736125B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010256158.0A CN111736125B (en) 2020-04-02 2020-04-02 Radar target identification method based on attention mechanism and bidirectional stacking cyclic neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010256158.0A CN111736125B (en) 2020-04-02 2020-04-02 Radar target identification method based on attention mechanism and bidirectional stacking cyclic neural network

Publications (2)

Publication Number Publication Date
CN111736125A CN111736125A (en) 2020-10-02
CN111736125B (en) 2023-07-07

Family

ID=72646547

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010256158.0A Active CN111736125B (en) 2020-04-02 2020-04-02 Radar target identification method based on attention mechanism and bidirectional stacking cyclic neural network

Country Status (1)

Country Link
CN (1) CN111736125B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112764024B (en) * 2020-12-29 2023-06-16 杭州电子科技大学 Radar target identification method based on convolutional neural network and Bert
CN112782660B (en) * 2020-12-29 2023-06-30 杭州电子科技大学 Radar target recognition method based on Bert
CN113238197B (en) * 2020-12-29 2023-07-04 杭州电子科技大学 Radar target identification and judgment method based on Bert and BiLSTM
CN112731309B (en) * 2021-01-06 2022-09-02 哈尔滨工程大学 Active interference identification method based on bilinear efficient neural network
CN112986941B (en) * 2021-02-08 2022-03-04 天津大学 Radar target micro-motion feature extraction method
CN113486917B (en) * 2021-05-17 2023-06-02 西安电子科技大学 Radar HRRP small sample target recognition method based on metric learning
CN114509736B (en) * 2022-01-19 2023-08-15 电子科技大学 Radar target identification method based on ultra-wide band electromagnetic scattering characteristics

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170262996A1 (en) * 2016-03-11 2017-09-14 Qualcomm Incorporated Action localization in sequential data with attention proposals from a recurrent network
CN109086700B (en) * 2018-07-20 2021-08-13 杭州电子科技大学 Radar one-dimensional range profile target identification method based on deep convolutional neural network
CN109214452B (en) * 2018-08-29 2020-06-23 杭州电子科技大学 HRRP target identification method based on attention depth bidirectional cyclic neural network
CN110334741B (en) * 2019-06-06 2023-03-31 西安电子科技大学 Radar one-dimensional range profile identification method based on cyclic neural network
CN110418210B (en) * 2019-07-12 2021-09-10 东南大学 Video description generation method based on bidirectional cyclic neural network and depth output

Also Published As

Publication number Publication date
CN111736125A (en) 2020-10-02

Similar Documents

Publication Publication Date Title
CN111736125B (en) Radar target identification method based on attention mechanism and bidirectional stacking cyclic neural network
CN109214452B (en) HRRP target identification method based on attention depth bidirectional cyclic neural network
CN112364779B (en) Underwater sound target identification method based on signal processing and deep-shallow network multi-model fusion
CN110443143B (en) Multi-branch convolutional neural network fused remote sensing image scene classification method
CN110334741B (en) Radar one-dimensional range profile identification method based on cyclic neural network
CN110045015B (en) Concrete structure internal defect detection method based on deep learning
CN111596276B (en) Radar HRRP target identification method based on spectrogram transformation and attention mechanism circulating neural network
CN106951915B (en) One-dimensional range profile multi-classifier fusion recognition method based on category confidence
EP3690741A2 (en) Method for automatically evaluating labeling reliability of training images for use in deep learning network to analyze images, and reliability-evaluating device using the same
CN114332649B (en) Cross-scene remote sensing image depth countermeasure migration method based on double-channel attention
CN112764024B (en) Radar target identification method based on convolutional neural network and Bert
Lin et al. Detection of gravitational waves using Bayesian neural networks
CN111596292B (en) Radar target identification method based on importance network and bidirectional stacking cyclic neural network
CN110766084B (en) Small sample SAR target identification method based on CAE and HL-CNN
CN113378796B (en) Cervical cell full-section classification method based on context modeling
CN111580097A (en) Radar target identification method based on single-layer bidirectional cyclic neural network
CN109239670B (en) Radar HRRP (high resolution ratio) identification method based on structure embedding and deep neural network
CN109948722B (en) Method for identifying space target
CN111580058A (en) Radar HRRP target identification method based on multi-scale convolution neural network
CN115438708A (en) Classification and identification method based on convolutional neural network and multi-mode fusion
CN117131436A (en) Radiation source individual identification method oriented to open environment
CN116106880B (en) Underwater sound source ranging method and device based on attention mechanism and multi-scale fusion
CN116030304A (en) Cross-domain remote sensing image migration resisting method based on weighted discrimination and multiple classifiers
CN116486183A (en) SAR image building area classification method based on multiple attention weight fusion characteristics
CN115329821A (en) Ship noise identification method based on pairing coding network and comparison learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant