CN113625227B - Attention transformation network-based radar high-resolution range profile target recognition method

Attention transformation network-based radar high-resolution range profile target recognition method

Info

Publication number
CN113625227B
CN113625227B
Authority
CN
China
Prior art keywords
attention
layer
network
convolution
resolution range
Prior art date
Legal status
Active
Application number
CN202110757184.6A
Other languages
Chinese (zh)
Other versions
CN113625227A (en)
Inventor
白雪茹 (Bai Xueru)
赵晨 (Zhao Chen)
杨敏佳 (Yang Minjia)
周峰 (Zhou Feng)
Current Assignee
Xidian University
Original Assignee
Xidian University
Priority date
Filing date
Publication date
Application filed by Xidian University filed Critical Xidian University
Priority to CN202110757184.6A
Publication of CN113625227A
Application granted
Publication of CN113625227B
Status: Active

Classifications

    • G: PHYSICS
    • G01: MEASURING; TESTING
    • G01S: RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S 7/00: Details of systems according to groups G01S 13/00, G01S 15/00, G01S 17/00
    • G01S 7/02: Details of systems according to groups G01S 13/00, G01S 15/00, G01S 17/00 of systems according to group G01S 13/00
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G06N 3/08: Learning methods
    • G06N 3/084: Backpropagation, e.g. using gradient descent
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A 90/00: Technologies having an indirect contribution to adaptation to climate change
    • Y02A 90/10: Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation


Abstract

The invention discloses a radar high-resolution range profile (HRRP) target recognition method based on an attention transformation network. It mainly addresses shortcomings of existing HRRP recognition methods: local details of the profile are hard to attend to, the network cannot focus on the more separable target regions of the profile, global temporal information is difficult to exploit, and recognition accuracy and performance are therefore limited. The implementation steps are: (1) generate a training set; (2) construct an attention transformation network; (3) train the attention transformation network; (4) recognize the radar high-resolution range profile targets to be classified. By exploiting the local detail features and the global temporal information of the high-resolution range profile, and by distinguishing the importance of its different range cells, the method effectively improves HRRP recognition performance.

Description

Attention transformation network-based radar high-resolution range profile target recognition method
Technical Field
The invention belongs to the technical field of radar, and more specifically to a radar high-resolution range profile target recognition method based on an attention transformation network within the field of radar target recognition. The invention provides an attention transformation network structure that can effectively recognize radar high-resolution range profiles.
Background
The high-resolution range profile (HRRP) is the vector sum of the sub-echoes of the target's scattering points, obtained with a broadband radar signal and projected onto the radar line of sight; at a given radar aspect angle it describes the distribution of the radar cross-section of the target's scatterers (e.g., the nose and fuselage of an aircraft) along the line of sight. Compared with synthetic aperture radar and inverse synthetic aperture radar images, HRRPs are easy to acquire and simple to process, so HRRP-based radar target recognition has become one of the important means of real-time radar target recognition. When a high-resolution radar observes a target continuously, it obtains an HRRP sequence containing important characteristics such as the target's shape, structure, scattering intensity, and the variation of scattering intensity with radar aspect angle. Existing deep-learning HRRP recognition methods mainly build on convolutional neural networks or long short-term memory (LSTM) networks, which effectively avoids complex hand-crafted feature design. However, the LSTM units used to learn the temporal information of HRRPs pay insufficient attention to local target details, and existing methods do not account for the differing importance, for target recognition, of the hidden states mapped from different range cells of the HRRP, resulting in low recognition accuracy.
Zhequan Fu, Xiangping Li, Bo Dan, and Xukun Wang, in the paper "A Neural Network with Convolutional Module and Residual Structure for Radar Target Recognition Based on High-Resolution Range Profile" (Sensors, 21 January 2020), propose a radar high-resolution range profile target recognition method based on a convolutional residual neural network. Its steps are: (1) expand the amplitude-normalized radar HRRP data set with a one-dimensional shift-interception technique, and divide the expanded samples into training and test samples; (2) train on the expanded training samples using a deep convolutional residual network, formed by cascading several convolution blocks and residual blocks, as the learner, with the edge-center loss as the loss function, obtaining a trained learner; (3) perform automatic radar HRRP target recognition with the trained learner. The method automatically extracts multi-layer target features with a deep neural network, improves feature separability with the edge-center loss, and effectively improves deep-network recognition accuracy with residual structures. However, it ignores the global temporal information between HRRPs, so its network training efficiency and recognition accuracy remain low.
Xidian University, in its patent application "High-resolution range profile target recognition method based on attitude-adaptive convolutional network" (application number CN202110032890.4, publication number CN112835008A), discloses a method with the following steps: (1) construct an attitude-adaptive convolutional network; (2) generate a training data set and an auxiliary data set; (3) preprocess the training data set; (4) generate adaptive convolution kernels; (5) train the attitude-adaptive convolutional network; (6) perform target recognition. By constructing an attitude-adaptive convolutional network and training it with target HRRP echoes and target attitude-angle information, the method effectively alleviates the attitude sensitivity of HRRP echoes. However, the hidden states mapped from different range cells of an HRRP differ in importance for target recognition, and this method does not distinguish those importance levels; it therefore cannot focus on the more separable target regions of the HRRP during recognition, which limits its recognition performance.
Disclosure of Invention
The invention aims to overcome the above shortcomings of the prior art by providing a radar high-resolution range profile target recognition method based on an attention transformation network, addressing the difficulties existing methods face in exploiting the temporal correlation between high-resolution range profiles, their low network training efficiency, and their inability to distinguish the importance of different regions of the profile.
The technical idea of the invention is as follows. An attention transformation network is constructed, and the radar high-resolution range profile is recognized directly by the trained network, avoiding the low training efficiency, the inability to focus on the more separable target regions of the profile, and the low recognition accuracy of existing methods. A convolution attention module, formed by cascading convolution sub-networks with an attention-enhanced convolution sub-network, performs fine feature extraction on the high-resolution range profile, overcoming the prior art's difficulty in attending to its local details. The features extracted by the convolution attention module are position-coded by a position coding module, establishing global temporal information and overcoming the prior art's difficulty in exploiting it. A multi-head attention transform encoder module then learns to distinguish the importance of the position-coded features, overcoming the prior art's difficulty in focusing on the more separable target regions of the high-resolution range profile.
The specific steps of the invention are as follows:
step 1, generating a training set:
(1a) select 147950 range profiles from the high-resolution range profiles of three types of aircraft, acquired by a radar with a center frequency of 5520 MHz, a signal bandwidth of 400 MHz, and a pulse repetition frequency of 400 Hz, to form a sample set;
(1b) preprocess each high-resolution range profile in the sample set by amplitude normalization followed by shift alignment;
(1c) slide a window over the preprocessed sample set, taking every 30 consecutive preprocessed high-resolution range profiles as one group;
(1d) form the training set from the 9859 groups of sequence samples obtained by the sliding window;
step 2, constructing an attention transformation network:
(2a) build a first convolution sub-network consisting of a convolution layer, a batch normalization layer, a nonlinear activation layer, and a max-pooling layer, and set its parameters;
(2b) build a second convolution sub-network consisting of a convolution layer, a batch normalization layer, a nonlinear activation layer, and a max-pooling layer, and set its parameters;
(2c) build a channel attention layer consisting of a global average attention pooling layer, a first convolution layer, a global maximum attention pooling layer, a second convolution layer, and a nonlinear activation layer; the output dimensions of the global average and global maximum attention pooling layers are 1×1, the first convolution layer has 4 convolution kernels of size 3×3 pixels, the second convolution layer has 32 convolution kernels of size 3×3 pixels, and the nonlinear activation layer uses the rectified linear unit (ReLU) activation function;
(2d) build a spatial attention layer consisting of an average attention pooling layer, a maximum attention pooling layer, a convolution layer, and a nonlinear activation layer; the average and maximum attention pooling layers each output 1 channel, the convolution layer has 1 convolution kernel of size 3×3 pixels, and the nonlinear activation layer uses the ReLU activation function;
(2e) build an attention-enhanced convolution sub-network consisting of a convolution layer, the channel attention layer, the spatial attention layer, a batch normalization layer, a nonlinear activation layer, and a max-pooling layer; the convolution layer has 32 convolution kernels of size 3×3 pixels with padding 1, the batch normalization layer has 32 channels, the nonlinear activation layer uses the ReLU activation function, and the max-pooling layer has a 2×2-pixel window with a stride of 2 pixels;
(2f) cascade the first convolution sub-network, the second convolution sub-network, and the attention-enhanced convolution sub-network into a convolution attention module;
(2g) build a position coding module consisting of a sine encoder and a cosine encoder; both encoders have encoding dimension 32, the position indices of the sine encoder are all even numbers in [0,96], and the position indices of the cosine encoder are all odd numbers in [0,96];
(2h) build a multi-head attention transform encoder module consisting of a multi-head attention group and a multi-layer perceptron; the multi-head attention group consists of 8 parallel attention heads, each computed from its key, query, and value by the scaled dot-product formula, where the key, query, and value have length 97 and dimension 32 pixels; the multi-layer perceptron consists of a first fully connected layer, a Gaussian error linear unit, and a second fully connected layer, whose weight dimensions are set to 32×128 and 128×32, respectively;
(2i) cascade the convolution attention module, the position coding module, and the multi-head attention transform encoder module into an attention transformation network;
step 3, training the attention transformation network:
input the training set into the attention transformation network, compute the cross-entropy loss between the network output and the class labels of the training samples with the cross-entropy loss function, and iteratively update the network parameters with the back-propagation algorithm until the cross-entropy loss converges, yielding the trained attention transformation network;
step 4, recognizing radar high-resolution range profile targets:
apply the same preprocessing and sliding-window processing as in steps (1b) and (1c) to the radar high-resolution range profiles to be recognized, obtaining a sample set to be recognized consisting of 22203 groups of sequence samples; input this sample set into the trained attention transformation network and output the class labels.
Compared with the prior art, the invention has the following advantages:
First, the convolution attention module formed by cascading the convolution sub-networks with the attention-enhanced convolution sub-network extracts fine local features of the radar high-resolution range profile, so the network focuses on its more useful local details; this overcomes the prior art's difficulty in attending to local details of the range profile while learning its temporal information, and improves recognition accuracy.
Second, building on the full use of the time-varying information of the radar high-resolution range profile, the invention position-codes the features extracted by the convolution attention module with the position coding module, establishing global temporal information; this overcomes the prior art's difficulty in exploiting that information and further improves recognition accuracy.
Third, the multi-head attention transform encoder module attention-codes the position-coded features and distinguishes the importance of different regions of the radar high-resolution range profile; this overcomes the prior art's difficulty in focusing on the more separable target regions of the profile and improves recognition performance.
Drawings
Fig. 1 is a flow chart of the present invention;
Fig. 2 is a schematic structural diagram of the backbone network module of the present invention;
Fig. 3 is a schematic structural diagram of the first and second convolution sub-networks in the backbone network module;
Fig. 4 is a schematic structural diagram of the channel attention layer and the spatial attention layer in the backbone network module.
Detailed Description
The invention is further described below with reference to the drawings and examples.
With reference to fig. 1, specific steps of an implementation of the present invention will be described in detail.
Step 1: generate a training set.
S1: select 147950 range profiles from the high-resolution range profiles of three types of aircraft, acquired by a radar with a center frequency of 5520 MHz, a signal bandwidth of 400 MHz, and a pulse repetition frequency of 400 Hz, to form a sample set.
S2: preprocess each high-resolution range profile in the sample set by amplitude normalization and shift alignment.
S3: slide a window over the preprocessed sample set, taking every 30 consecutive preprocessed high-resolution range profiles as one group.
The sliding window proceeds as follows: first, arrange all high-resolution range profiles of the preprocessed sample set in a row to obtain a concatenated sample sequence; then slide a rectangular window of length 30 profiles and width 1 profile over this sequence with a stride of 15 profiles, taking all profiles inside the window at each position as one sequence sample (see the sketch below).
S4: form the training set from the 9859 groups of sequence samples obtained by the sliding window.
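The grouping can be summarized in a short sketch (Python/NumPy, the platform named in the simulation section); the function name and array layout are illustrative assumptions, not taken from the patent:

    import numpy as np

    def sliding_window_groups(profiles, window=30, stride=15):
        # profiles: (num_profiles, num_range_cells), amplitude-normalized and aligned
        groups = [profiles[s:s + window]
                  for s in range(0, profiles.shape[0] - window + 1, stride)]
        return np.stack(groups)  # (num_groups, 30, num_range_cells)

Applied per class and concatenated, roughly 9859 such groups result from the 147950 preprocessed profiles.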
Step 2: construct an attention transformation network.
Build a backbone network consisting of three modules in sequence: a convolution attention module, a position coding module, and a multi-head attention transform encoder module. The convolution attention module is a cascade of the first convolution sub-network, the second convolution sub-network, and the attention-enhanced convolution sub-network; the position coding module consists of a sine encoder and a cosine encoder; the multi-head attention transform encoder module is a cascade of a multi-head attention group and a multi-layer perceptron.
The backbone network module constructed by the invention is further described with reference to Fig. 2. Its input is a radar high-resolution range profile sequence: the convolution attention module extracts local features of the profiles, the position coding module position-codes those features, and the multi-head attention transform encoder module completes the attention coding of the features and outputs the recognition result.
The first and second convolution sub-networks constructed in accordance with the present invention will be further described with reference to fig. 3.
Build the first convolution sub-network, consisting of a convolution layer, a batch normalization layer, a nonlinear activation layer, and a max-pooling layer; its convolution layer has 8 convolution kernels of size 7×7 pixels with padding 3, the batch normalization layer has 8 channels, the nonlinear activation layer uses the ReLU activation function, and the max-pooling layer has a 2×2-pixel window with a stride of 2 pixels, as shown in Fig. 3(a).
Build the second convolution sub-network from the same layer types; its convolution layer has 16 convolution kernels of size 5×5 pixels with padding 2, the batch normalization layer has 16 channels, the nonlinear activation layer uses the ReLU activation function, and the max-pooling layer has a 2×2-pixel window with a stride of 2 pixels, as shown in Fig. 3(b).
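Under the stated hyper-parameters, the two sub-networks can be sketched in PyTorch; treating each input sequence as a single-channel two-dimensional map is an assumption:

    import torch.nn as nn

    # Convolution sub-network: conv -> batch norm -> ReLU -> 2x2 max pooling (stride 2).
    def conv_subnetwork(in_ch, out_ch, kernel, pad):
        return nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=kernel, padding=pad),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),
        )

    first_subnet = conv_subnetwork(1, 8, kernel=7, pad=3)    # Fig. 3(a): 8 kernels of 7x7
    second_subnet = conv_subnetwork(8, 16, kernel=5, pad=2)  # Fig. 3(b): 16 kernels of 5x5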
The channel attention layer and the spatial attention layer constructed by the present invention will be further described with reference to fig. 4.
Build the channel attention layer, consisting of a global average attention pooling layer, a first convolution layer, a global maximum attention pooling layer, a second convolution layer, and a nonlinear activation layer; the output dimensions of the global average and global maximum attention pooling layers are 1×1, the first convolution layer has 4 convolution kernels of size 3×3 pixels, the second convolution layer has 32 convolution kernels of size 3×3 pixels, and the nonlinear activation layer uses the ReLU activation function, as shown in Fig. 4(a).
Build the spatial attention layer, consisting of an average attention pooling layer, a maximum attention pooling layer, a convolution layer, and a nonlinear activation layer; the outputs of the average and maximum attention pooling layers are concatenated along the channel dimension to form the overall pooling output, each pooling layer outputs 1 channel, the convolution layer has 1 convolution kernel of size 3×3 pixels, and the nonlinear activation layer uses the ReLU activation function, as shown in Fig. 4(b).
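A CBAM-style sketch of the two attention layers with the stated parameters follows; the final sigmoid gating is an assumption borrowed from the CBAM literature (the patent itself names only ReLU activations inside the layers):

    import torch
    import torch.nn as nn

    class ChannelAttention(nn.Module):
        def __init__(self, channels=32, reduced=4):
            super().__init__()
            self.avg_pool = nn.AdaptiveAvgPool2d(1)  # global average attention pooling, 1x1
            self.max_pool = nn.AdaptiveMaxPool2d(1)  # global maximum attention pooling, 1x1
            self.conv1 = nn.Conv2d(channels, reduced, 3, padding=1)  # 4 kernels of 3x3
            self.conv2 = nn.Conv2d(reduced, channels, 3, padding=1)  # 32 kernels of 3x3
            self.relu = nn.ReLU(inplace=True)

        def forward(self, x):
            w = self.conv2(self.relu(self.conv1(self.avg_pool(x)))) \
              + self.conv2(self.relu(self.conv1(self.max_pool(x))))
            return x * torch.sigmoid(w)  # assumption: sigmoid gate over channels

    class SpatialAttention(nn.Module):
        def __init__(self):
            super().__init__()
            self.conv = nn.Conv2d(2, 1, 3, padding=1)  # 1 kernel of 3x3

        def forward(self, x):
            avg = x.mean(dim=1, keepdim=True)           # average attention pooling, 1 channel
            mx, _ = x.max(dim=1, keepdim=True)          # maximum attention pooling, 1 channel
            w = self.conv(torch.cat([avg, mx], dim=1))  # channel-dimension concatenation
            return x * torch.sigmoid(w)                 # assumption: sigmoid gate over positions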
Build the attention-enhanced convolution sub-network, consisting of a convolution layer, the channel attention layer, the spatial attention layer, a batch normalization layer, a nonlinear activation layer, and a max-pooling layer; its convolution layer has 32 convolution kernels of size 3×3 pixels with padding 1, the batch normalization layer has 32 channels, the nonlinear activation layer uses the ReLU activation function, and the max-pooling layer has a 2×2-pixel window with a stride of 2 pixels.
Cascade the first convolution sub-network, the second convolution sub-network, and the attention-enhanced convolution sub-network into the convolution attention module.
Build the position coding module, consisting of a sine encoder and a cosine encoder; both encoders have encoding dimension 32, the position indices of the sine encoder are all even numbers in [0,96], and the position indices of the cosine encoder are all odd numbers in [0,96].
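Reading "position index" as the token position (0 through 96), the sine/cosine coding can be sketched as follows; the 10000 frequency base is an assumption carried over from the standard transformer encoding:

    import torch

    def positional_encoding(seq_len=97, dim=32):
        pe = torch.zeros(seq_len, dim)
        pos = torch.arange(seq_len, dtype=torch.float32).unsqueeze(1)  # indices 0..96
        i = torch.arange(dim, dtype=torch.float32)                     # dimensions 0..31
        angle = pos / (10000.0 ** (i / dim))                           # (97, 32) angle table
        pe[0::2] = torch.sin(angle[0::2])  # sine at the even position indices
        pe[1::2] = torch.cos(angle[1::2])  # cosine at the odd position indices
        return pe                          # added to the 97 x 32 feature tokens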
Build the multi-head attention transform encoder module, consisting of a multi-head attention group and a multi-layer perceptron. The multi-head attention group consists of 8 parallel attention heads: the position-coded features are duplicated to serve as the value, key, and query, each attention head is computed by the scaled dot-product formula, and the outputs of the 8 heads are summed to give the output of the multi-head attention group, where the key, query, and value have length 97 and dimension 32 pixels. The multi-layer perceptron is a cascade of a first fully connected layer, a Gaussian error linear unit, and a second fully connected layer; the output of the multi-head attention group is fed to the multi-layer perceptron for feature perception and then to a softmax classifier, completing the multi-head attention transform encoder module; the weight dimensions of the first and second fully connected layers are set to 32×128 and 128×32, respectively.
The scaled dot-product formula is as follows:

$$\mathrm{head}_{i}=\sum_{m=1}^{M}\frac{\exp\left(Q_{m}K_{m}^{T}/\sqrt{d}\right)}{\sum_{n=1}^{N}\exp\left(Q_{n}K_{n}^{T}/\sqrt{d}\right)}V_{m}$$

where head_i denotes the i-th attention head, Σ denotes summation, M is the total number of weight scores, m indexes the weight scores, exp(·) is the exponential with base e, Q_m and K_m are the query and key used in computing the m-th weight score, T denotes transposition, d is the dimension of the key, N is the total number of relevance scores in the denominator, n indexes the relevance scores, Q_n and K_n are the query and key used in computing the n-th relevance score, and V_m is the value used in computing the m-th weight score.
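One attention head of the formula above reduces to a standard softmax over the scaled dot products; a sketch follows, with the per-head projection matrices omitted for brevity (an assumption):

    import math
    import torch

    def attention_head(Q, K, V, d=32):
        scores = Q @ K.transpose(-2, -1) / math.sqrt(d)  # (97, 97) scaled dot products
        weights = torch.softmax(scores, dim=-1)          # exp / sum of exp, per query row
        return weights @ V                               # weighted sum of the values

    # The position-coded features x (97 tokens of dimension 32) are duplicated
    # into query, key, and value, and the 8 parallel heads are summed:
    # out = sum(attention_head(x, x, x) for _ in range(8))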
The softmax function is as follows:

$$p_{t}=\frac{\exp\left(O_{v}\right)}{\sum_{l=1}^{L}\exp\left(O_{l}\right)}$$

where p_t is the probability that the input sample belongs to class t, with t = 1, 2, …, M; exp(·) is the exponential with base e; O_v is the output of the v-th neuron, with v equal to t; L is the total number of neurons; l indexes the neurons; and O_l is the output of the l-th neuron.
Step 3: train the attention transformation network.
Step 3.1: initialize the weight and bias parameters of all convolution layers, channel attention layers, spatial attention layers, multi-head attention groups, and multi-layer perceptrons in the attention transformation network.
Step 3.2: input the radar high-resolution range profile samples of the training set into the convolution attention module for local feature extraction, obtaining fine local features.
Step 3.3: input the fine local features into the position coding module to obtain the position-coded features.
Step 3.4: input the position-coded features into the multi-head attention transform encoder module to obtain the class labels predicted by the attention transformation network.
Step 3.5: compute the error of the attention transformation network from the cross-entropy loss function, using the predicted class labels and the labels of the radar high-resolution range profile samples.
The cross-entropy loss function has the form:

$$F=-\sum_{j}x_{j}\ln y_{j}$$

where F denotes the cross-entropy loss, j indexes the sample classes in the training set, x_j is the true label of each training sample, ln is the logarithm with base e, and y_j is the output of the attention transformation network.
Step 3.6: back-propagate the error of the attention transformation network and update the weight and bias parameters of each convolution layer, channel attention layer, spatial attention layer, multi-head attention group, and multi-layer perceptron by gradient descent.
Step 3.7: repeat steps 3.2 through 3.6 with the updated parameters, and stop iterating once the error has converged stably, obtaining the trained attention transformation network.
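Steps 3.1 through 3.7 amount to a standard supervised training loop; a minimal sketch follows, in which the network class, data loader, optimizer choice, and learning rate are illustrative assumptions:

    import torch
    import torch.nn as nn

    def train(model, train_loader, num_epochs=50, lr=1e-3):
        # model: hypothetical module wrapping steps (2a)-(2i);
        # train_loader yields (30-profile sequence, class label) batches from step 1
        criterion = nn.CrossEntropyLoss()                         # cross-entropy loss F
        optimizer = torch.optim.SGD(model.parameters(), lr=lr)    # gradient descent
        for _ in range(num_epochs):                 # iterate until the loss converges
            for x, labels in train_loader:
                optimizer.zero_grad()
                loss = criterion(model(x), labels)  # error vs. the true labels
                loss.backward()                     # back-propagate the error
                optimizer.step()                    # update weights and biases
        return model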
Step 4: recognize radar high-resolution range profile targets.
Apply the same preprocessing and sliding-window processing as in S2 and S3 to the radar high-resolution range profiles to be recognized, obtaining a sample set to be recognized consisting of 22203 groups of sequence samples; input this sample set into the trained attention transformation network and output the class labels.
The effects of the present invention are further described below in connection with simulation experiments.
1. Simulation experiment conditions.
The hardware platform for the simulation experiments is an Intel Xeon E5-2650 CPU at 2.20 GHz with 64 GB of memory and an NVIDIA GeForce GTX 1080 Ti graphics card.
The software platform is the Windows 10 operating system with Matlab 2018, Python 3.6, and PyTorch 1.4.
2. Simulation experiment content and result analysis.
The simulation experiments recognize the three types of aircraft radar high-resolution range profiles with the method of the invention and with a conventional convolutional neural network, respectively, on the same data set, and compare the recognition results.
The data set used in the simulation experiments consists of the high-resolution range profiles of three types of aircraft, acquired by a radar with a center frequency of 5520 MHz, a signal bandwidth of 400 MHz, and a pulse repetition frequency of 400 Hz. The three aircraft targets are the An-26, the Cessna Citation, and the Yak-42; the An-26 and the Cessna Citation each have 7 data segments, and the Yak-42 has 5. Segments 5 and 6 of the An-26, segments 6 and 7 of the Cessna Citation, and segments 2 and 5 of the Yak-42 are selected as the training sample set; segments 1-4 of the An-26, segments 1-5 of the Cessna Citation, and segments 1, 3, and 4 of the Yak-42 form the test sample set.
The conventional-convolutional-neural-network method used for comparison is taken from the paper "HRRP feature extraction and recognition method of radar ground target using convolutional neural network" published by Beicheng Ding, Penghui Chen, et al. This method first applies a fast Fourier transform to the radar high-resolution range profile, then extracts features from the transformed profile, and finally inputs the extracted features into the designed convolutional neural network to recognize the profile.
Simulation experiment 1: the method of the invention is applied to recognizing the three types of aircraft radar high-resolution range profile targets. The training sample set is first used to train the attention-transformation-network-based recognition network, and the test sample set is then used to test the trained network.
The recognition accuracy in simulation experiment 1 is calculated by the following formula:

$$c=\frac{1}{R}\sum_{r=1}^{R}h\left(t_{r},y_{r}\right)$$

where c is the recognition accuracy on the test sample set, R is the total number of test samples, h(·) is the class discrimination function, t_r is the true class label of the r-th test sample, and y_r is the output of the attention transformation network for the r-th test sample; h(t_r, y_r) equals 1 when t_r equals y_r and 0 otherwise.
With R = 22203, the recognition accuracy of the invention is calculated to be 96.82%.
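The computation corresponds to a simple correct-count average; a sketch follows, with list-based labels as an assumption:

    # h(t_r, y_r) is 1 when the predicted and true labels agree, 0 otherwise.
    def recognition_accuracy(true_labels, predicted_labels):
        R = len(true_labels)
        correct = sum(1 for t, y in zip(true_labels, predicted_labels) if t == y)
        return correct / R  # with R = 22203, the invention reaches 96.82%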
Simulation experiment 2: the three types of aircraft radar high-resolution range profiles are recognized with the conventional convolutional-neural-network method. The training sample set is used to train the CNN-based high-resolution range profile recognition network, and the test sample set is then used to test the trained network.
The recognition accuracy in simulation experiment 2 is calculated by the same formula:

$$c=\frac{1}{R}\sum_{r=1}^{R}h\left(t_{r},y_{r}\right)$$

where the symbols are as defined above, except that y_r now denotes the output of the conventional convolutional neural network for the r-th test sample.
With R = 22203, the recognition accuracy of the conventional convolutional neural network is calculated to be 87.47%.
In summary, compared with existing methods, the attention-transformation-network-based radar high-resolution range profile recognition method of the invention effectively improves recognition performance.

Claims (5)

1. A radar high-resolution range profile target recognition method based on an attention transformation network, characterized in that a convolution attention module formed by convolution sub-networks and an attention-enhanced convolution sub-network extracts local features of the high-resolution range profile, a position coding module then position-codes the local features, and a multi-head attention transform encoder module attention-codes the position-coded features; the method comprises the following steps:
step 1, generating a training set:
(1a) select 147950 range profiles from the high-resolution range profiles of three types of aircraft, acquired by a radar with a center frequency of 5520 MHz, a signal bandwidth of 400 MHz, and a pulse repetition frequency of 400 Hz, to form a sample set;
(1b) preprocess each high-resolution range profile in the sample set by amplitude normalization followed by shift alignment;
(1c) slide a window over the preprocessed sample set, taking every 30 consecutive preprocessed high-resolution range profiles as one group;
(1d) form the training set from the 9859 groups of sequence samples obtained by the sliding window;
step 2, constructing an attention transformation network:
(2a) build a first convolution sub-network consisting of a convolution layer, a batch normalization layer, a nonlinear activation layer, and a max-pooling layer, and set its parameters;
(2b) build a second convolution sub-network consisting of a convolution layer, a batch normalization layer, a nonlinear activation layer, and a max-pooling layer, and set its parameters;
(2c) build a channel attention layer consisting of a global average attention pooling layer, a first convolution layer, a global maximum attention pooling layer, a second convolution layer, and a nonlinear activation layer; the output dimensions of the global average and global maximum attention pooling layers are 1×1, the first convolution layer has 4 convolution kernels of size 3×3 pixels, the second convolution layer has 32 convolution kernels of size 3×3 pixels, and the nonlinear activation layer uses the rectified linear unit (ReLU) activation function;
(2d) build a spatial attention layer consisting of an average attention pooling layer, a maximum attention pooling layer, a convolution layer, and a nonlinear activation layer; the average and maximum attention pooling layers each output 1 channel, the convolution layer has 1 convolution kernel of size 3×3 pixels, and the nonlinear activation layer uses the ReLU activation function;
(2e) build an attention-enhanced convolution sub-network consisting of a convolution layer, the channel attention layer, the spatial attention layer, a batch normalization layer, a nonlinear activation layer, and a max-pooling layer; the convolution layer has 32 convolution kernels of size 3×3 pixels with padding 1, the batch normalization layer has 32 channels, the nonlinear activation layer uses the ReLU activation function, and the max-pooling layer has a 2×2-pixel window with a stride of 2 pixels;
(2f) cascade the first convolution sub-network, the second convolution sub-network, and the attention-enhanced convolution sub-network into a convolution attention module;
(2g) build a position coding module consisting of a sine encoder and a cosine encoder; both encoders have encoding dimension 32, the position indices of the sine encoder are all even numbers in [0,96], and the position indices of the cosine encoder are all odd numbers in [0,96];
(2h) build a multi-head attention transform encoder module consisting of a multi-head attention group and a multi-layer perceptron; the multi-head attention group consists of 8 parallel attention heads, each computed from its key, query, and value by the scaled dot-product formula, where the key, query, and value have length 97 and dimension 32 pixels; the multi-layer perceptron consists of a first fully connected layer, a Gaussian error linear unit, and a second fully connected layer, whose weight dimensions are set to 32×128 and 128×32, respectively;
(2i) cascade the convolution attention module, the position coding module, and the multi-head attention transform encoder module into an attention transformation network;
step 3, training the attention transformation network:
input the training set into the attention transformation network, compute the cross-entropy loss between the network output and the class labels of the training samples with the cross-entropy loss function, and iteratively update the network parameters with the back-propagation algorithm until the cross-entropy loss converges, yielding the trained attention transformation network;
step 4, recognizing radar high-resolution range profile targets:
apply the same preprocessing and sliding-window processing as in steps (1b) and (1c) to the radar high-resolution range profiles to be recognized, obtaining a sample set to be recognized consisting of 22203 groups of sequence samples; input this sample set into the trained attention transformation network and output the class labels.
2. The attention-transformation-network-based radar high-resolution range profile target recognition method of claim 1, wherein the sub-network parameters in step (2a) are as follows: the convolution layer has 8 convolution kernels of size 7×7 pixels with padding 3, the batch normalization layer has 8 channels, the nonlinear activation layer uses the ReLU activation function, and the max-pooling layer has a 2×2-pixel window with a stride of 2 pixels.
3. The attention-transformation-network-based radar high-resolution range profile target recognition method of claim 1, wherein the sub-network parameters in step (2b) are as follows: the convolution layer has 16 convolution kernels of size 5×5 pixels with padding 2, the batch normalization layer has 16 channels, the nonlinear activation layer uses the ReLU activation function, and the max-pooling layer has a 2×2-pixel window with a stride of 2 pixels.
4. The attention-transformation-network-based radar high-resolution range profile target recognition method of claim 1, wherein the scaled dot-product formula in step (2h) is as follows:

$$\mathrm{head}_{i}=\sum_{m=1}^{M}\frac{\exp\left(Q_{m}K_{m}^{T}/\sqrt{d}\right)}{\sum_{n=1}^{N}\exp\left(Q_{n}K_{n}^{T}/\sqrt{d}\right)}V_{m}$$

where head_i denotes the i-th attention head, Σ denotes summation, M is the total number of weight scores, m indexes the weight scores, exp(·) is the exponential with base e, Q_m and K_m are the query and key used in computing the m-th weight score, T denotes transposition, d is the dimension of the key, N is the total number of relevance scores in the denominator, n indexes the relevance scores, Q_n and K_n are the query and key used in computing the n-th relevance score, and V_m is the value used in computing the m-th weight score.
5. The attention-transformation-network-based radar high-resolution range profile target recognition method of claim 1, wherein the cross-entropy loss function in step 3 has the form:

$$F=-\sum_{j}x_{j}\ln y_{j}$$

where F denotes the cross-entropy loss, j indexes the sample classes in the training set, x_j is the true label of each training sample, ln is the logarithm with base e, and y_j is the output of the attention transformation network.
Application CN202110757184.6A, filed 2021-07-05: Attention transformation network-based radar high-resolution range profile target recognition method; granted as CN113625227B (Active).


Publications (2)

Publication Number    Publication Date
CN113625227A (en)     2021-11-09
CN113625227B (en)     2023-07-04




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant