CN108846409A - Radar echo extrapolation model training method based on cyclic dynamic convolution neural network
- Publication number
- CN108846409A (application CN201810402200.8A)
- Authority
- CN
- China
- Prior art keywords
- layer
- layers
- error term
- characteristic pattern
- convolution kernel
- Prior art date
- Legal status: Pending (an assumption, not a legal conclusion)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
Abstract
The invention discloses a radar echo extrapolation model training method based on a cyclic dynamic convolutional neural network (RDCNN), which comprises the following steps: RDCNN offline training — for a given set of radar echo images, a training sample set is obtained through data preprocessing; an RDCNN model is initialized; and the RDCNN is trained with the training sample set, the network being brought to convergence by alternating forward propagation to compute output values and backpropagation to update the network parameters.
Description
Technical field
The invention belongs to the technical field of surface weather observation in atmospheric sounding, and more particularly to a radar echo extrapolation model training method based on a cyclic dynamic convolutional neural network.
Background technique
Nowcasting mainly refers to weather forecasting at high spatial and temporal resolution over the next 0–3 hours; its main prediction targets include disastrous weather such as heavy precipitation, strong winds, and hail. At present, many forecasting systems use numerical prediction models, but because numerical forecasts suffer from slow spin-up, their nowcasting ability is limited. New-generation Doppler weather radar has very high sensitivity and resolution: the spatial resolution of its data can reach 200–1000 m, and the temporal resolution can reach 2–15 min. In addition, Doppler radar offers a well-designed operating mode, comprehensive condition monitoring and fault warning, an advanced real-time calibration system, and a rich set of radar meteorology products, which can greatly improve the reliability of nowcasting. New-generation Doppler weather radar has therefore become one of the most effective tools for nowcasting. Nowcasting with Doppler radar is based primarily on radar echo extrapolation, i.e., inferring the future position and intensity of radar echoes from the current radar observations, thereby predicting the track of strong convective systems.
Traditional radar echo extrapolation methods are the centroid tracking method and the cross-correlation method based on the maximum correlation coefficient (Tracking Radar Echoes by Correlation, TREC), but both have certain shortcomings. Centroid tracking is only applicable to relatively strong storm cells of small extent and is unreliable for forecasting large-scale precipitation. TREC generally treats the echo as changing linearly, whereas real echo evolution is far more complex, and the method is also susceptible to disordered vector disturbances in the motion vector field. In addition, existing methods make poor use of radar data, while historical radar data contain important features of local weather system evolution and therefore have high research value.
To improve the lead time of radar echo extrapolation and to learn the laws of radar echo evolution from large amounts of historical radar data, machine learning methods are introduced into radar echo extrapolation. Convolutional neural networks (CNNs), an important branch of deep learning, are widely used in fields such as image processing and pattern recognition. Their most distinctive features are local connectivity, weight sharing, and down-sampling, which give them strong robustness to deformation, translation, and flipping of the input image. To exploit the strong temporal correlation between radar echo images, a cyclic dynamic convolutional neural network driven by the input is designed; the network can dynamically change its weight parameters according to the input radar echo maps and thereby predict the extrapolated image. Training the cyclic dynamic convolutional neural network on historical radar data allows the network to extract echo features more fully and to learn the laws of echo evolution, which is of great significance for improving radar echo extrapolation accuracy and optimizing the nowcasting effect.
Summary of the invention
Purpose of the invention: the technical problem to be solved by the present invention is that existing radar echo extrapolation methods have a short extrapolation lead time and make insufficient use of radar data. A radar echo extrapolation method based on a cyclic dynamic convolutional neural network (RDCNN) is proposed, realizing the extrapolation forecast of constant altitude plan position indicator (CAPPI) images of radar echo intensity. The method comprises the following steps:
Step 1, data preprocessing: input the training image set, standardize every image in the set, and convert each image into a 280 × 280 grayscale image to obtain a grayscale image set; partition the grayscale image set and construct a training sample set containing TrainsetSize groups of samples;
Step 2, initialize the RDCNN: design the RDCNN structure, composed of a cyclic dynamic sub-network RDSN that generates the probability vectors and a probability prediction layer PPL that predicts the radar echo at the next time instant, providing the initialization model of the RDCNN for the offline training stage;
Step 3, initialize the training parameters of the RDCNN: set the network learning rate λ = 0.0001, the number of samples input per training pass BatchSize = 10, the maximum number of batch training passes over the training sample set BatchMax = ⌊TrainsetSize / BatchSize⌋, the current batch training pass BatchNum = 1, the maximum number of iterations of network training IterationMax = 40, and the current iteration number IterationNum = 1;
Step 4, read training samples: using batch training, read BatchSize groups of training samples from the training sample set obtained in step 1 for each training pass; each group of training samples contains five images {x1, x2, x3, x4, y}, where {x1, x2, x3, x4} is the input image sequence and y is the corresponding control label;
Step 5, forward propagation: extract the features of the input image sequence in the RDSN to obtain the horizontal probability vector HPV and the vertical probability vector VPV; in the probability prediction layer, convolve the last image of the input image sequence successively with VPV and then HPV to obtain the output forecast image of forward propagation;
Step 6, backpropagation: compute the error terms of the probability vectors in the PPL; then, from the probability-vector error terms, compute the error term of every network layer in the RDSN layer by layer from back to front; then compute the gradients of each network layer's error term with respect to the weight parameters and bias parameters, and use the resulting gradients to update the network parameters;
Step 7, offline training stage control: overall control of the offline neural network training stage is divided into the following three cases:
If unused training samples remain in the training sample set, i.e. BatchNum < BatchMax, return to step 4, continue to read BatchSize groups of training samples, and carry out network training;
If no unused training samples remain in the training sample set, i.e. BatchNum = BatchMax, and the current number of network iterations is less than the maximum number of iterations, i.e. IterationNum < IterationMax, then set BatchNum = 1, return to step 4, continue to read BatchSize groups of training samples, and carry out network training;
If no unused training samples remain in the training sample set, i.e. BatchNum = BatchMax, and the number of network iterations has reached the maximum number of iterations, i.e. IterationNum = IterationMax, terminate the RDCNN offline training stage and obtain the converged RDCNN model.
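The three-case control logic of step 7 can be sketched as a simple loop. The function and variable names below are illustrative assumptions, not the patent's code; the real steps 4–6 (read a batch, forward-propagate, backpropagate) would run where the schedule entry is recorded.

```python
def offline_training(trainset_size, batch_size=10, iteration_max=40):
    """Sketch of the step-7 control flow over batches and iterations."""
    batch_max = trainset_size // batch_size
    schedule = []  # record the (iteration, batch) pairs actually trained
    iteration_num, batch_num = 1, 1
    while True:
        schedule.append((iteration_num, batch_num))  # steps 4-6 would run here
        if batch_num < batch_max:                    # case 1: samples remain
            batch_num += 1
        elif iteration_num < iteration_max:          # case 2: start next iteration
            batch_num = 1
            iteration_num += 1
        else:                                        # case 3: training terminates
            return schedule
```

With TrainsetSize = 50 and BatchSize = 10, each iteration trains 5 batches, and the loop ends only when both counters reach their maxima.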
Step 1 comprises the following steps:
Step 1-1, sampling: input the training image set; the images in the set are arranged in chronological order at equal time intervals of 6 minutes and number NTrain in total. TrainsetSize is determined by the following formula:
TrainsetSize = (NTrain − 1 − Mod(NTrain − 1, 4)) / 4 = ⌊(NTrain − 1) / 4⌋,
where Mod(a, 4) denotes the remainder of a modulo 4 and ⌊x⌋ denotes the largest integer not greater than x. After TrainsetSize is obtained, the first 4 × TrainsetSize + 1 images of the training image set are retained; the sampling meets this image count requirement by deleting the last images of the training image set;
Step 1-2, normalize images: apply image transformation and normalization operations to the sampled images, converting color images with an original resolution of 2000 × 2000 into grayscale images with a resolution of 280 × 280;
Step 1-3, construct the training sample set: the training sample set is constructed from the grayscale images obtained in step 1-2. Every four adjacent images in the grayscale image set, i.e. the {4N+1, 4N+2, 4N+3, 4N+4}-th images, serve as one input sequence, and the [4 × (N+1) + 1]-th image is cropped so that its central 240 × 240 portion is retained as the corresponding control label. The N-th group of samples Sample_N is constructed as follows:
Sample_N = ( {G_{4N+1}, G_{4N+2}, G_{4N+3}, G_{4N+4}}, Crop(G_{4(N+1)+1}) ),
where G_{4N+1} denotes the (4N+1)-th image of the grayscale image set, N is an integer with N ∈ [0, TrainsetSize − 1], and Crop() denotes the cropping operation that retains the central 240 × 240 portion of the original image. This finally yields the training sample set containing TrainsetSize groups of training samples.
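A minimal sketch of the grouping of steps 1-1 and 1-3, assuming the images are already 280 × 280 grayscale arrays; the function name and the centered Crop() offsets are illustrative assumptions.

```python
import numpy as np

def build_training_set(gray_images):
    """Group every four adjacent 280x280 grayscale images as an input
    sequence and crop the central 240x240 of the following image as the
    control label (steps 1-1 and 1-3); a sketch, not the patented code."""
    n_train = len(gray_images)
    trainset_size = (n_train - 1) // 4          # keep 4*TrainsetSize+1 images
    samples = []
    for n in range(trainset_size):
        seq = gray_images[4 * n:4 * n + 4]      # images 4N+1 .. 4N+4 (1-based)
        label_full = gray_images[4 * (n + 1)]   # image 4(N+1)+1 (1-based)
        label = label_full[20:260, 20:260]      # Crop(): central 240x240
        samples.append((seq, label))
    return samples
```

For example, 9 input images yield TrainsetSize = 2 groups, and any leftover trailing images are simply never indexed, matching the deletion rule of step 1-1.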
Step 1-2 comprises the following steps:
Step 1-2-1, image conversion: convert the images sampled in step 1-1 into grayscale images, crop them to retain the part whose central resolution is 560 × 560 in the original image, and compress the cropped images to a resolution of 280 × 280, yielding grayscale images with a resolution of 280 × 280;
Step 1-2-2, data normalization: map the value of each pixel of the grayscale images obtained in step 1-2-1 from [0, 255] to [0, 1].
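The conversion of steps 1-2-1 and 1-2-2 can be sketched as follows. The grayscale conversion by channel averaging and the 2 × 2 block-mean compression from 560 × 560 to 280 × 280 are assumptions, since the patent does not specify the grayscale or resampling method.

```python
import numpy as np

def preprocess(color_image):
    """Steps 1-2-1 / 1-2-2 sketch: grayscale, central 560x560 crop,
    2x2 block-mean downsample to 280x280, then map [0,255] -> [0,1]."""
    gray = color_image.mean(axis=2)                         # simple grayscale
    h, w = gray.shape
    top, left = (h - 560) // 2, (w - 560) // 2
    crop = gray[top:top + 560, left:left + 560]             # central 560x560
    small = crop.reshape(280, 2, 280, 2).mean(axis=(1, 3))  # 560 -> 280
    return small / 255.0                                    # normalize
```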
Step 2 comprises the following steps:
Step 2-1, construct the cyclic dynamic sub-network RDSN: the sub-network consists of 15 network layers, from front to back: convolutional layer C1, down-sampling layer S1, hidden layer H1, convolutional layer C2, down-sampling layer S2, hidden layer H2, convolutional layer C3, down-sampling layer S3, hidden layer H3, convolutional layer C4, down-sampling layer S4, hidden layer H4, convolutional layer C5, hidden layer H5, and classifier layer F1;
Step 2-2, construct the probability prediction layer PPL: dynamic convolutional layers DC1 and DC2 are constructed in the probability prediction layer; the vertical probability vector VPV output by the RDSN serves as the convolution kernel of dynamic convolutional layer DC1, and the horizontal probability vector HPV serves as the convolution kernel of dynamic convolutional layer DC2.
Step 2-1 comprises the following steps:
Step 2-1-1, construct the convolutional layers: for each convolutional layer lC, lC ∈ {C1, C2, C3, C4, C5}, determine the following: the number of output feature maps OutputMaps_lC of the convolutional layer, its convolution kernels k_lC, and its bias parameters bias_lC. For the convolution kernels, the kernel width KernelSize_lC and the number of kernels KernelNumber_lC must be determined; the latter equals the product of the numbers of input and output feature maps of the convolutional layer, and the kernels are constructed according to the Xavier initialization method. The number of bias parameters equals the number of output feature maps of the layer. The output feature map width of layer lC is OutputSize_lC, jointly determined by the input feature map width of layer lC and the kernel width KernelSize_lC, i.e. OutputSize_lC = InputSize_lC − KernelSize_lC + 1, where InputSize_lC denotes the output feature map width of the layer preceding convolutional layer lC;
For convolutional layer C1, set the number of output feature maps OutputMapsC1 = 12, the output feature map width OutputSizeC1 = 272, the kernel width KernelSizeC1 = 9, the bias parameter biasC1 initialized to zero, and the number of kernels KernelNumberC1 = 48; in accordance with Xavier initialization, the initial value of each parameter in a convolution kernel is (Rand() − 0.5) · 2 · sqrt(6 / (nin + nout)), where nin and nout are the numbers of input and output feature maps of the layer and Rand() generates a uniform random number in [0, 1];
For convolutional layer C2, set OutputMapsC2 = 32, OutputSizeC2 = 128, KernelSizeC2 = 9, the C2-layer biases initialized to zero, and KernelNumberC2 = 384; each kernel parameter is initialized by the same Xavier rule;
For convolutional layer C3, set OutputMapsC3 = 32, OutputSizeC3 = 56, KernelSizeC3 = 9, the C3-layer biases initialized to zero, and KernelNumberC3 = 1024; each kernel parameter is initialized by the same Xavier rule;
For convolutional layer C4, set OutputMapsC4 = 32, OutputSizeC4 = 20, KernelSizeC4 = 9, the C4-layer biases initialized to zero, and KernelNumberC4 = 1024; each kernel parameter is initialized by the same Xavier rule;
For convolutional layer C5, set OutputMapsC5 = 32, OutputSizeC5 = 4, KernelSizeC5 = 7, the C5-layer biases initialized to zero, and KernelNumberC5 = 1024; each kernel parameter is initialized by the same Xavier rule;
Step 2-1-2, construct the hidden layers: for each hidden layer lH, lH ∈ {H1, H2, H3, H4, H5}, determine the following: the number of output feature maps OutputMaps_lH of the hidden layer, its convolution kernels k_lH, and its bias parameters bias_lH. For the convolution kernels, the kernel width KernelSize_lH and the number of kernels KernelNumber_lH must be determined; the latter equals the product of the numbers of input and output feature maps of the hidden layer, and the kernels are constructed according to the Xavier initialization method. The number of bias parameters equals the number of output feature maps of the hidden layer. The output feature map width of layer lH is OutputSize_lH, which is identical to the input feature map width of the corresponding convolutional layer;
For hidden layer H1, set OutputMapsH1 = 4, OutputSizeH1 = 280, KernelSizeH1 = 9, the bias parameter biasH1 initialized to zero, and KernelNumberH1 = 48; the initial value of each kernel parameter follows the same Xavier rule as in step 2-1-1, with Rand() generating a uniform random number;
For hidden layer H2, set OutputMapsH2 = 8, OutputSizeH2 = 136, KernelSizeH2 = 9, the H2-layer biases initialized to zero, and KernelNumberH2 = 256; each kernel parameter is initialized by the same Xavier rule;
For hidden layer H3, set OutputMapsH3 = 8, OutputSizeH3 = 64, KernelSizeH3 = 9, the H3-layer biases initialized to zero, and KernelNumberH3 = 256; each kernel parameter is initialized by the same Xavier rule;
For hidden layer H4, set OutputMapsH4 = 8, OutputSizeH4 = 28, KernelSizeH4 = 9, the H4-layer biases initialized to zero, and KernelNumberH4 = 256; each kernel parameter is initialized by the same Xavier rule;
For hidden layer H5, set OutputMapsH5 = 8 and OutputSizeH5 = 10; the H5-layer biases are initialized to zero, and the H5 layer contains 256 weight parameters kH5, each initialized by the same Xavier rule.
Step 2-1-3, construct the down-sampling layers: the down-sampling layers contain no parameters that require training; the sampling kernels of down-sampling layers S1, S2, S3, and S4 are initialized as the 2 × 2 mean kernel whose four elements all equal 1/4. For a down-sampling layer lS, lS ∈ {S1, S2, S3, S4}, the number of output feature maps OutputMaps_lS is identical to that of the convolutional layer immediately above it, and the output feature map width OutputSize_lS is half of the output feature map width of that convolutional layer, expressed by the formula OutputSize_lS = OutputSize_{lS−1} / 2;
Step 2-1-4, construct the classifier layer: the classifier layer consists of one fully connected layer F1. The weight parameters of the F1 layer are the horizontal weight parameter matrix WH and the vertical weight parameter matrix WV, both of size 41 × 512; each parameter in the weight matrices is initialized according to the Xavier rule of step 2-1-1. The bias parameters are the horizontal bias parameter BH and the vertical bias parameter BV, each initialized as a 41 × 1 one-dimensional zero vector.
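The widths fixed in steps 2-1-1 through 2-1-3 are mutually consistent: each convolutional layer maps width w to w − KernelSize + 1, and each down-sampling layer halves it. A quick check of the chain 280 → 272 → 136 → 128 → 64 → 56 → 28 → 20 → 10 → 4:

```python
def rdsn_widths():
    """Verify the feature-map widths of the RDSN: a convolutional layer
    maps width w -> w - k + 1 and a down-sampling layer halves it."""
    conv = lambda w, k: w - k + 1
    pool = lambda w: w // 2
    w = 280                       # input (and H1 output) width
    widths = {}
    for name, k in [("C1", 9), ("C2", 9), ("C3", 9), ("C4", 9)]:
        w = conv(w, k); widths[name] = w
        w = pool(w); widths["S" + name[1]] = w
    widths["C5"] = conv(w, 7)     # C5 uses a width-7 kernel
    return widths
```

Running this reproduces exactly the OutputSize values listed above for C1–C5 and S1–S4.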
Step 5 comprises the following steps:
Step 5-1, the RDSN computes the probability vectors: the image sequence features of the input are extracted in the sub-network by the alternating processing of convolutional layers and down-sampling layers and are processed in the classifier layer by the Softmax function, yielding the horizontal probability vector HPV and the vertical probability vector VPV;
Step 5-2, the PPL outputs the forecast image: with the HPV and VPV obtained in step 5-1 as the convolution kernels of the probability prediction layer, the last image of the input image sequence is convolved successively with VPV and then HPV, giving the output forecast image of forward propagation.
Step 5-1 comprises the following steps:
Step 5-1-1, determine the network layer type: let l denote the current layer in the RDSN; l takes the values {H1, C1, S1, H2, C2, S2, H3, C3, S3, H4, C4, S4, H5, C5, F1} in turn, with initial value H1. Judge the type of layer l: if l ∈ {H1, H2, H3, H4, H5}, l is a hidden layer, execute step 5-1-2; if l ∈ {C1, C2, C3, C4, C5}, l is a convolutional layer, execute step 5-1-3; if l ∈ {S1, S2, S3, S4}, l is a down-sampling layer, execute step 5-1-4; if l = F1, l is the classifier layer, execute step 5-1-5. During training, the output feature maps that a convolutional layer produced in the previous training pass are denoted aC′, where C ∈ {C1, C2, C3, C4, C5}; the initial value of aC′ is the zero matrix;
Step 5-1-2, process a hidden layer: here l = lH, lH ∈ {H1, H2, H3, H4, H5}, and two cases arise:
When lH ∈ {H1, H2, H3, H4}: first compute the j-th output feature map a_lH^j of layer lH. If lH = H1, then C = C1. Expand the width of the corresponding feature map in aC′ to ExpandSize_lH by zero-pixel padding, convolve it with the corresponding kernels of this layer, sum the convolution results, add the j-th bias parameter bias_lH^j of layer lH, and process the sum with the ReLU activation function to obtain a_lH^j. The calculation formula is as follows:
a_lH^j = ReLU( Σ_{i=1..nh} Expand_Zero(a_C^i′) * k_lH^{ij} + bias_lH^j ),
where Expand_Zero() denotes the zero-padding expansion function, k_lH^{ij} is the kernel connecting the i-th input feature map and the j-th output feature map of layer lH, bias_lH^j is the j-th bias of layer lH, nh is the number of input feature maps of the current hidden layer, a_C^i′ denotes the i-th input feature map of layer lH, and ExpandSize_lH is determined by the input feature map width and the kernel size, with ExpandSize_lH = InputSize_lH + 2 · (KernelSize_lH − 1);
When lH = H5: first compute the j-th output feature map a_H5^j of the H5 layer. Expand the feature map resolution of aC5′ to 10 × 10 by zero-pixel padding, multiply it element-wise by the corresponding weight parameters of this layer, sum the results, add the j-th bias parameter bias_H5^j of the H5 layer, and process with the ReLU activation function to obtain a_H5^j. The calculation formula is as follows:
a_H5^j = ReLU( Σ_{i=1..nh} Expand_Zero(a_C5^i′) ⊙ k_H5^{ij} + bias_H5^j ),
where k_H5^{ij} is the weight parameter of the i-th input feature map of the H5 layer corresponding to its j-th output feature map and ⊙ denotes element-wise multiplication;
Compute all output feature maps of layer lH in turn to obtain the output feature maps a_lH of the layer, update l to l + 1, and return to step 5-1-1 to judge the network type and carry out the operation of the next network layer;
Step 5-1-3, process a convolutional layer: here l = lC, lC ∈ {C1, C2, C3, C4, C5}. First compute the j-th output feature map a_lC^j of layer lC: convolve each input feature map of layer lC with the corresponding kernel of this layer, sum the convolution results, add the j-th bias parameter bias_lC^j of layer lC, and process with the ReLU activation function to obtain a_lC^j. The calculation formula is as follows:
a_lC^j = ReLU( Σ_{i=1..nc} a_{lC−1}^i * k_lC^{ij} + bias_lC^j ),
where k_lC^{ij} is the kernel connecting the i-th input feature map and the j-th output feature map of layer lC, nc is the number of input feature maps of the convolutional layer, a_{lC−1}^i denotes the i-th input feature map of layer lC, which is also the i-th output feature map of layer lC − 1, and * denotes matrix convolution; if lC = C1, layer lC − 1 is the input layer.
Compute all output feature maps of layer lC in turn to obtain the output feature maps a_lC of the layer, and update aC′ with the value of a_lC (lC = C; for example, when lC = C1, aC1′ is updated with aC1). Update l to l + 1 and return to step 5-1-1 to judge the network type and carry out the operation of the next network layer;
Step 5-1-4, process a down-sampling layer: here l = lS, lS ∈ {S1, S2, S3, S4}. Convolve each output feature map of the preceding convolutional layer with the 2 × 2 sampling kernel of step 2-1-3 and sample with stride 2 to obtain the output feature maps of layer lS. The calculation formula is as follows:
a_lS^j = Sample( a_{lS−1}^j * kernel_lS ),
where Sample() denotes sampling with stride 2, lS − 1 denotes the convolutional layer immediately preceding the current down-sampling layer, and a_lS^j denotes the j-th output feature map of layer lS. After the output feature maps a_lS of layer lS are obtained, update l to l + 1 and return to step 5-1-1 to judge the network type and carry out the operation of the next network layer;
Step 5-1-5, compute the F1-layer probability vectors: here l = F1. By matrix transformation, unfold the 32 output feature maps of C5, each of resolution 4 × 4, in column order to obtain the output feature vector aF1 of the F1 layer, of resolution 512 × 1. Compute the matrix products of the horizontal weight parameter matrix WH with aF1 and of the vertical weight parameter matrix WV with aF1, add the horizontal bias parameter BH and the vertical bias parameter BV to the respective results, and process each sum with the Softmax function to obtain the horizontal probability vector HPV and the vertical probability vector VPV. The specific calculation formulas are as follows:
HPV = Softmax(WH · aF1 + BH),
VPV = Softmax(WV · aF1 + BV).
The vertical probability vector VPV is then transposed to obtain the final vertical probability vector.
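Step 5-1-5 amounts to two affine maps followed by Softmax. The sketch below assumes NumPy arrays of the stated sizes (WH, WV: 41 × 512; BH, BV: 41 × 1) and returns HPV as a 1 × 41 row kernel and the final (transposed) VPV as a 41 × 1 column kernel; the numerically stable Softmax form is an implementation choice.

```python
import numpy as np

def classifier_probability_vectors(a_c5, WH, WV, BH, BV):
    """Step 5-1-5 sketch: unfold C5's 32 feature maps of size 4x4 in
    column order into the 512x1 vector aF1, then Softmax the two
    affine transforms to obtain the probability-vector kernels."""
    a_f1 = a_c5.reshape(512, 1, order="F")        # column-order unfolding

    def softmax(z):
        e = np.exp(z - z.max())                   # numerically stable
        return e / e.sum()

    hpv = softmax(WH @ a_f1 + BH).reshape(1, 41)  # horizontal kernel
    vpv = softmax(WV @ a_f1 + BV).reshape(41, 1)  # vertical kernel
    return hpv, vpv
```

Because of the Softmax, each vector's 41 entries are non-negative and sum to 1, which is what makes the dynamic convolution of step 5-2 a probability-weighted displacement of the echo image.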
Step 5-2 comprises the following steps:
Step 5-2-1, DC1-layer prediction in the vertical direction: convolve the last input image of the input layer with the vertical probability vector VPV to obtain the DC1-layer output feature map aDC1, of resolution 240 × 280;
Step 5-2-2, DC2-layer prediction in the horizontal direction: convolve the DC1-layer output feature map aDC1 with the horizontal probability vector HPV to obtain the output forecast image of forward propagation, of resolution 240 × 240.
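Steps 5-2-1 and 5-2-2 are two "valid" one-dimensional convolutions with the probability vectors; since each vector sums to 1, every forecast pixel is a probability-weighted average over a 41-pixel neighbourhood. The sliding-window (correlation) form below is an assumption, as the patent does not state whether the dynamic kernels are flipped.

```python
import numpy as np

def ppl_forecast(last_image, vpv, hpv):
    """Steps 5-2-1 / 5-2-2 sketch: 'valid' convolution of the last
    280x280 input image with the 41x1 vertical and then the 1x41
    horizontal probability vector, yielding a 240x240 forecast."""
    v = vpv.ravel()
    h = hpv.ravel()
    # vertical pass (DC1): 280x280 -> 240x280
    a_dc1 = np.array([[v @ last_image[r:r + 41, c]
                       for c in range(last_image.shape[1])]
                      for r in range(last_image.shape[0] - 40)])
    # horizontal pass (DC2): 240x280 -> 240x240
    forecast = np.array([[a_dc1[r, c:c + 41] @ h
                          for c in range(a_dc1.shape[1] - 40)]
                         for r in range(a_dc1.shape[0])])
    return forecast
```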
Step 6 comprises the following steps:
Step 6-1, compute the PPL error terms: take the difference between the forecast image obtained in step 5-2-2 and the control label of the input training sample, compute the error terms of the DC2 and DC1 layers, and finally obtain the error term δHPV of the horizontal probability vector and the error term δVPV of the vertical probability vector;
Step 6-2, compute the RDSN error terms: from the error term δHPV of the horizontal probability vector and the error term δVPV of the vertical probability vector, compute in turn, from back to front, the error terms of the classifier layer F1, the convolutional layers (C5, C4, C3, C2, C1), the hidden layers (H5, H4, H3, H2, H1), and the down-sampling layers (S4, S3, S2, S1); the resolution of any layer's error term matrix is identical to the resolution of that layer's output feature maps;
Step 6-3, compute the gradients: from the error terms obtained in step 6-2, compute the gradient of each network layer's error term with respect to the layer's weight parameters and bias parameters;
Step 6-4, update the parameters: multiply the gradients of each network layer's weight and bias parameters obtained in step 6-3 by the learning rate of the RDCNN to obtain the update terms of each layer's weight and bias parameters; subtract the respective update terms from the original weight and bias parameters to obtain the updated weight and bias parameters.
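Step 6-4 is plain gradient descent. A minimal sketch, with hypothetical parameter and gradient dictionaries standing in for the network's weight and bias tensors:

```python
import numpy as np

def sgd_update(params, grads, learning_rate=0.0001):
    """Step 6-4 sketch: new_param = param - learning_rate * gradient,
    applied uniformly to every weight and bias of the network."""
    return {name: p - learning_rate * grads[name]
            for name, p in params.items()}
```

The default learning rate 0.0001 matches λ as initialized in step 3.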
Step 6-1 comprises the following steps:
Step 6-1-1, compute the dynamic convolutional layer DC2 error term: take the difference between the forecast image obtained in step 5-2-2 and the control label of this group of samples to obtain the error term matrix δDC2, of size 240 × 240;
Step 6-1-2, compute the dynamic convolutional layer DC1 error term: expand the DC2-layer error term matrix δDC2 to 240 × 320 by zero padding, rotate the horizontal probability vector by 180 degrees, and convolve the expanded error term matrix with the rotated horizontal probability vector to obtain the DC1-layer error term matrix δDC1, of size 240 × 280. The formula is as follows:
δDC1 = Expand_Zero(δDC2) * rot180(HPV),
where rot180() denotes the rotation function with an angle of 180°. As an illustration of zero expansion, a 2 × 2 matrix is extended to a 4 × 4 matrix in which the region of central resolution 2 × 2 agrees with the original matrix and the remaining positions are filled with zero pixels;
Step 6-1-3, compute the probability vector error terms: to compute the error term of the horizontal probability vector HPV, convolve the DC1-layer output feature map with the error term matrix δDC2; the convolution yields a 1 × 41 row vector, which is the error term δHPV of HPV. The formula is as follows:
δHPV = aDC1 * δDC2.
To compute the error term of the vertical probability vector VPV, convolve the input feature map of the input layer with the error term matrix δDC1; the convolution yields a 41 × 1 column vector, which is the error term δVPV of VPV. The formula is as follows:
δVPV = x4 * δDC1,
where x4 is the last image of the input image sequence of the training sample;
Step 6-2 comprises the following steps:
Step 6-2-1, compute the classifier layer F1 error term: multiply the probability vector error terms δVPV and δHPV obtained in step 6-1-3 by the F1-layer vertical weight parameter matrix WV and horizontal weight parameter matrix WH respectively, then sum the two matrix products and take their average to obtain the F1-layer error term δF1. The formula is as follows:
δF1 = ( (WV)^T × δVPV + (WH)^T × (δHPV)^T ) / 2,
where × denotes matrix multiplication and (·)^T denotes the transpose of a matrix; the resulting δF1 has size 512 × 1.
Step 6-2-2, compute the convolutional layer C5 error term: by matrix transformation, transform the F1-layer error term δF1 obtained in step 6-2-1 into 32 matrices of resolution 4 × 4, {δC5^1, ..., δC5^32}, which together form the C5-layer error term δC5; δC5^32 denotes the 32nd transformed 4 × 4 matrix;
Step 6-2-3, determine the network layer type: let l denote the current layer of the RDSN; l takes the values {H5, S4, C4, H4, S3, C3, H3, S2, C2, H2, S1, C1, H1} in turn, with initial value H5. Judge the type of layer l: if l ∈ {H5, H4, H3, H2, H1}, l is a hidden layer, execute step 6-2-4; if l ∈ {S4, S3, S2, S1}, l is a down-sampling layer, execute step 6-2-5; if l ∈ {C4, C3, C2, C1}, l is a convolutional layer, execute step 6-2-6;
Step 6-2-4, calculate the hidden layer error term: at this time l = l_H, l_H ∈ {H5, H4, H3, H2, H1}. To calculate the i-th error term matrix δ_lH^i of layer l_H, each error term matrix δ_{l+1} of the convolutional layer l+1 is first expanded by zero padding to width ExpandSize_{l+1}:
ExpandSize_{l+1} = OutputSize_{l+1} + 2·(KernelSize_{l+1} − 1);
the corresponding convolution kernels are then rotated by 180 degrees, the expanded matrices are convolved with the rotated kernels, and the convolution results are summed to obtain the i-th error term matrix δ_lH^i of layer l_H. The formula is as follows:
δ_lH^i = Σ_{j=1..nc} Expand_Zero(δ_{l+1}^j) * rot180(k_{l+1}^{ij}),
where nc denotes the number of error terms of convolutional layer l+1, which equals the number of output feature maps of layer l+1, i.e. nc = OutputMaps_{l+1}.
All error term matrices are calculated in turn to obtain the error term δ_lH of layer l_H; l is updated to l−1, and the procedure returns to step 6-2-3 to judge the network type and calculate the error term of the previous network layer;
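The zero-padding plus 180-degree kernel rotation of step 6-2-4 amounts to a "full" convolution of the next layer's error terms; a toy sketch under assumed small shapes (helper names and sizes are ours):

```python
import numpy as np

def valid_correlate(a, b):
    rows, cols = a.shape[0] - b.shape[0] + 1, a.shape[1] - b.shape[1] + 1
    return np.array([[np.sum(a[i:i + b.shape[0], j:j + b.shape[1]] * b)
                      for j in range(cols)] for i in range(rows)])

def hidden_error(deltas_next, kernels_next):
    """delta_lH^i = sum_j Expand_Zero(delta_{l+1}^j) * rot180(k^{ij}) (step 6-2-4)."""
    acc = None
    for d, k in zip(deltas_next, kernels_next):
        pad = k.shape[0] - 1                 # so ExpandSize = OutputSize + 2*(KernelSize-1)
        d_exp = np.pad(d, pad)               # zero expansion
        term = valid_correlate(d_exp, np.rot90(k, 2))  # 180-degree rotated kernel
        acc = term if acc is None else acc + term
    return acc

# toy shapes: next-layer error 4x4, kernel 3x3 -> error matrix of width 4 + (3-1) = 6
deltas = [np.ones((4, 4)), np.ones((4, 4))]
kernels = [np.ones((3, 3)), np.ones((3, 3))]
err = hidden_error(deltas, kernels)
```

The result has the width of the layer's own feature maps, matching the statement in step 1-6-2 that each error term matrix has the resolution of that layer's output.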
Step 6-2-5, calculate the down-sampling layer error term: at this time l = l_S, l_S ∈ {S4, S3, S2, S1}. To calculate the i-th error term matrix δ_lS^i of layer l_S, each error term matrix δ_{l+2} of the convolutional layer l+2 is first expanded by zero padding to width ExpandSize_{l+2}:
ExpandSize_{l+2} = OutputSize_{l+2} + 2·(KernelSize_{l+2} − 1);
the corresponding convolution kernels are then rotated by 180 degrees, the expanded matrices are convolved with the rotated kernels, and the convolution results are summed to obtain the i-th error term matrix δ_lS^i of layer l_S. The formula is as follows:
δ_lS^i = Σ_{j=1..nc} Expand_Zero(δ_{l+2}^j) * rot180(k_{l+2}^{ij}),
where nc denotes the number of error terms of convolutional layer l+2, which equals the number of output feature maps of layer l+2, i.e. nc = OutputMaps_{l+2}.
All error term matrices are calculated in turn to obtain the error term δ_lS of layer l_S; l is updated to l−1, and the procedure returns to step 6-2-3 to judge the network type and calculate the error term of the previous network layer;
Step 6-2-6, calculate the convolutional layer error term: at this time l = l_C, l_C ∈ {C4, C3, C2, C1} (since the initial value of l in step 6-2-3 is H5, the case l_C = C5 does not occur). For the i-th error term matrix δ_lC^i of layer l_C, the corresponding i-th error term matrix δ_{l+1}^i of the down-sampling layer l+1 is first up-sampled; during up-sampling, each element of δ_{l+1}^i is distributed evenly over its sampling region, giving an up-sampling matrix of resolution OutputSize_lC × OutputSize_lC. The derivative of the activation function at the corresponding feature map of layer l_C is then multiplied element-wise with the up-sampling matrix to obtain the i-th error term matrix δ_lC^i of layer l_C. The formula is as follows:
δ_lC^i = ReLU′(a_lC^i) ∘ UpSample(δ_{l+1}^i),
where ∘ denotes the element-wise (Hadamard) matrix product and ReLU′(·) denotes the derivative of the ReLU activation function, i.e. ReLU′(x) = 1 if x > 0 and 0 otherwise. UpSample(·) denotes the up-sampling function: each pixel of the original image corresponds to one up-sampling region, and the original pixel value is distributed evenly over every pixel of that region. All error term matrices are calculated in turn to obtain the error term δ_lC of layer l_C;
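The mean-allocation up-sampling and ReLU derivative of step 6-2-6 (Fig. 6 illustrates the up-sampling of a 2×2 matrix) can be sketched as follows; np.kron spreads each error value evenly over its 2×2 sampling region, and the toy data are ours:

```python
import numpy as np

def upsample(delta, s=2):
    """Distribute each error value evenly over its s x s sampling region."""
    return np.kron(delta, np.ones((s, s)) / (s * s))

def relu_grad(a):
    """ReLU'(x): 1 where the activation is positive, 0 elsewhere."""
    return (a > 0).astype(float)

delta_S = np.array([[4.0, 8.0],
                    [0.0, -4.0]])        # 2x2 down-sampling-layer error term
a_C = np.array([[1.0, -1.0, 2.0, 3.0],
                [0.5, 2.0, -2.0, 1.0],
                [1.0, 1.0, 1.0, 1.0],
                [-1.0, 1.0, 1.0, 1.0]])  # 4x4 conv-layer output feature map
delta_C = relu_grad(a_C) * upsample(delta_S)  # element-wise (Hadamard) product
```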
Step 6-2-7: at this point layer l is a convolutional layer, i.e. l = l_C, and two cases are distinguished:
if l ≠ C1, l is updated to l−1, and the procedure returns to step 6-2-3 to judge the network type and calculate the error term of the previous network layer;
if l = C1, the sub-network error term calculation of step 6-2 ends;
Step 6-3 includes the following steps:
Step 6-3-1, calculate the gradient of each convolutional layer error term with respect to the convolution kernels: let l_C denote the convolutional layer currently being processed, l_C ∈ {C1, C2, C3, C4, C5}. Starting from layer C1, the gradient of each convolutional layer error term with respect to its convolution kernels is calculated in turn: the i-th input feature map a_{lC−1}^i of the convolutional layer is convolved with the j-th error term matrix δ_lC^j of layer l_C, and the convolution result is the gradient value ∇k_lC^{ij} of the corresponding convolution kernel. The formula is as follows:
∇k_lC^{ij} = a_{lC−1}^i * δ_lC^j,
where j ∈ [1, OutputMaps_lC] and i ∈ [1, OutputMaps_{lC−1}], OutputMaps_lC and OutputMaps_{lC−1} respectively denoting the number of output feature maps of layer l_C and of layer l_C − 1;
Step 6-3-2, calculate the gradient of each convolutional layer error term with respect to the biases: let l_C denote the convolutional layer currently being processed, l_C ∈ {C1, C2, C3, C4, C5}. Starting from layer C1, the gradient of each convolutional layer error term with respect to its biases is calculated in turn: all elements of the j-th error term matrix δ_lC^j of layer l_C are summed, giving the gradient value ∇b_lC^j of the j-th bias of this layer. The formula is as follows:
∇b_lC^j = Sum(δ_lC^j),
where Sum(·) denotes summation over all elements of a matrix;
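Steps 6-3-1 and 6-3-2 reduce to one valid correlation and one element sum per map; a toy numpy sketch (sizes are illustrative only):

```python
import numpy as np

def valid_correlate(a, b):
    rows, cols = a.shape[0] - b.shape[0] + 1, a.shape[1] - b.shape[1] + 1
    return np.array([[np.sum(a[i:i + b.shape[0], j:j + b.shape[1]] * b)
                      for j in range(cols)] for i in range(rows)])

a_in = np.arange(16.0).reshape(4, 4)   # i-th input feature map of the conv layer
delta = np.ones((2, 2))                # j-th error term matrix of the conv layer

grad_k = valid_correlate(a_in, delta)  # gradient w.r.t. the kernel k^{ij} (step 6-3-1)
grad_b = np.sum(delta)                 # gradient w.r.t. the j-th bias (step 6-3-2)
```

Note that the gradient has the kernel's own shape: a 4×4 input correlated with a 2×2 error term gives a 3×3 gradient, i.e. input width minus error width plus one.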
Step 6-3-3, calculate the gradient of each hidden layer error term with respect to the convolution kernels: let l_H denote the hidden layer currently being processed, l_H ∈ {H1, H2, H3, H4, H5}. Starting from layer H1, the gradient of each hidden layer error term with respect to its convolution kernels is calculated in turn. The hidden layer error term is first cropped, retaining its central part, denoted δ̂_lH; when l_H = H5, the central 4×4 part of the H5-layer error term is retained. The i-th input feature map of the hidden layer is then convolved with the j-th component of δ̂_lH, and the convolution result is the gradient value ∇k_lH^{ij} of the corresponding convolution kernel. The formula is as follows:
∇k_lH^{ij} = a_{lH−1}^i * δ̂_lH^j,
where j ∈ [1, OutputMaps_lH] and i ∈ [1, OutputMaps_{lH−1}], OutputMaps_lH and OutputMaps_{lH−1} respectively denoting the number of output feature maps of layer l_H and of layer l_H − 1;
Step 6-3-4, calculate the gradient of each hidden layer error term with respect to the biases: let l_H denote the hidden layer currently being processed, l_H ∈ {H1, H2, H3, H4, H5}. Starting from layer H1, the gradient of each hidden layer error term with respect to its biases is calculated in turn: all elements of the j-th component of the cropped error term δ̂_lH obtained in step 6-3-3 are summed, giving the gradient value ∇b_lH^j of the j-th bias of this layer. The formula is as follows:
∇b_lH^j = Sum(δ̂_lH^j),
where Sum(·) denotes summation over all elements of a matrix;
Step 6-3-5, calculate the gradient of the F1-layer error term with respect to the weighting parameters: the products of the probability vector error terms δ_HPV, δ_VPV with the F1-layer error term δ_F1 are calculated separately; the results are the gradient values of the F1-layer error term with respect to the weighting parameters WH and WV. The formulas are as follows:
∇WH = (δ_HPV)^T × (δ_F1)^T,
∇WV = δ_VPV × (δ_F1)^T,
where ∇WH is the gradient value of the error term with respect to the horizontal weighting parameters and ∇WV is the gradient value of the error term with respect to the vertical weighting parameters;
Step 6-3-6, calculate the gradient of the F1-layer error term with respect to the offset parameters: the error terms δ_HPV and δ_VPV of the horizontal and vertical probability vectors serve directly as the gradient values of the F1-layer error term with respect to the horizontal offset parameter BH and the vertical offset parameter BV. The formulas are as follows:
∇BH = (δ_HPV)^T,
∇BV = δ_VPV,
where ∇BH is the gradient value of the error term with respect to the horizontal offset parameter and ∇BV is the gradient value of the error term with respect to the vertical offset parameter;
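With δ_HPV a 1×41 row vector, δ_VPV a 41×1 column vector, and δ_F1 of size 512×1, the products of steps 6-3-5 and 6-3-6 yield gradients matching the 41×512 weight matrices and 41×1 offset vectors, as this shape check sketches (random stand-in data):

```python
import numpy as np

rng = np.random.default_rng(1)
d_HPV = rng.standard_normal((1, 41))   # horizontal probability vector error term
d_VPV = rng.standard_normal((41, 1))   # vertical probability vector error term
d_F1 = rng.standard_normal((512, 1))   # F1-layer error term

grad_WH = d_HPV.T @ d_F1.T             # (41x1)(1x512) -> 41x512, matches WH
grad_WV = d_VPV @ d_F1.T               # (41x1)(1x512) -> 41x512, matches WV
grad_BH = d_HPV.T                      # 41x1, matches BH
grad_BV = d_VPV                        # 41x1, matches BV
```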
Step 6-4 includes the following steps:
Step 6-4-1, update the weighting parameters of each convolutional layer: the gradient of each convolutional layer error term with respect to the convolution kernels obtained in step 6-3-1 is multiplied by the learning rate of the RDCNN to give the correction term of the convolution kernel; the original convolution kernel minus this correction term gives the updated convolution kernel. The formula is as follows:
k_lC^{ij} = k_lC^{ij} − λ·∇k_lC^{ij};
Step 6-4-2, update the offset parameters of each convolutional layer: the gradient of each convolutional layer error term with respect to the biases obtained in step 6-3-2 is multiplied by the learning rate of the RDCNN to give the correction term of the offset parameter; the original bias term minus this correction term gives the updated bias term. The formula is as follows:
b_lC^j = b_lC^j − λ·∇b_lC^j;
Step 6-4-3, update the weighting parameters of each hidden layer: the gradient of each hidden layer error term with respect to the convolution kernels obtained in step 6-3-3 is multiplied by the learning rate of the RDCNN to give the correction term of the convolution kernel; the original convolution kernel minus this correction term gives the updated convolution kernel. The formula is as follows:
k_lH^{ij} = k_lH^{ij} − λ·∇k_lH^{ij};
Step 6-4-4, update the offset parameters of each hidden layer: the gradient of each hidden layer error term with respect to the biases obtained in step 6-3-4 is multiplied by the learning rate of the RDCNN to give the correction term of the offset parameter; the original bias term minus this correction term gives the updated bias term. The formula is as follows:
b_lH^j = b_lH^j − λ·∇b_lH^j;
Step 6-4-5, update the F1-layer weighting parameters: the gradient values of the F1-layer error term with respect to the weighting parameters WH and WV obtained in step 6-3-5 are multiplied by the learning rate of the RDCNN to give the correction terms of the weighting parameters; the original WH and WV minus the respective correction terms give the updated WH and WV. The formulas are as follows:
WH = WH − λ·∇WH,
WV = WV − λ·∇WV;
Step 6-4-6, update the F1-layer offset parameters: the gradient values of the F1-layer error term with respect to the offset parameters BH and BV obtained in step 6-3-6 are multiplied by the learning rate of the RDCNN to give the correction terms of the offset parameters; the original BH and BV minus the respective correction terms give the updated BH and BV. The formulas are as follows:
BH = BH − λ·∇BH,
BV = BV − λ·∇BV.
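Every update of step 6-4 applies the same gradient-descent rule, parameter minus learning rate times gradient; one kernel update suffices as a sketch (toy values):

```python
import numpy as np

lam = 0.0001                       # learning rate lambda of the RDCNN (step 1-3)
k = np.full((3, 3), 0.5)           # a convolution kernel
grad_k = np.ones((3, 3))           # its gradient from step 6-3

k = k - lam * grad_k               # updated kernel, as in step 6-4-1
```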
Beneficial effects: the present invention realizes radar echo extrapolation using convolutional neural network (CNN) image processing techniques and proposes a recurrent dynamic convolutional neural network (RDCNN) structure. The network consists of a recurrent dynamic sub-network (RDSN) and a probability prediction layer (PPL), and possesses both dynamic and recurrent characteristics. The convolution kernels of the PPL are calculated by the RDSN and have a mapping relationship with the input radar echo images, so in the online testing stage of the RDCNN these kernels can still change according to differences in the input, giving the network its dynamic characteristic. The RDSN adds hidden layers to the traditional CNN model; the hidden layers and convolutional layers form a recurrent structure through which historical training information can be retained recursively, giving the network its recurrent characteristic. The RDCNN is trained with a large amount of radar echo image data until the network converges; the trained network can realize radar echo extrapolation well.
Detailed description of the invention
The present invention is further illustrated below with reference to the accompanying drawings and specific embodiments; the above-mentioned and other advantages of the invention will become apparent.
Fig. 1 is flow chart of the present invention.
Fig. 2 is a structure diagram of the recurrent dynamic convolutional neural network initialization model.
Fig. 3 is a structure diagram of the recurrent dynamic sub-network.
Fig. 4 is a structure diagram of the probability prediction layer.
Fig. 5 is a schematic diagram of zero expansion of a matrix.
Fig. 6 is a schematic diagram of the process of up-sampling a 2×2 matrix.
Specific embodiment
The present invention will be further described with reference to the accompanying drawings and embodiments.
As shown in Fig. 1, the invention discloses a radar echo extrapolation model training method based on a recurrent dynamic convolutional neural network, comprising the following steps:
Step 1, offline training of the recurrent dynamic convolutional neural network RDCNN: input the training image set, perform data preprocessing on it to obtain the training sample set, design the RDCNN structure, and initialize the network training parameters; train the RDCNN with the training sample set: an ordered image sequence is input, a forecast image is obtained by forward propagation, the error between the forecast image and the control label is calculated, and the weighting and offset parameters of the network are updated by backpropagation; this process is repeated until the prediction result reaches the training termination condition, giving a converged RDCNN model;
Step 2, RDCNN online prediction: input the test image set, perform data preprocessing on it to obtain the test sample set, and feed the test sample set into the RDCNN model obtained in step 1; the probability vectors are calculated by forward propagation through the network, and the last radar echo image of the input image sequence is convolved with the obtained probability vectors to give the predicted radar echo extrapolation image.
Step 1 includes the following steps:
Step 1-1, data preprocessing: input the training image set and perform normalization on every image in it, converting each image into a 280×280 grayscale image to obtain a grayscale image collection; the grayscale image collection is then divided to construct a training sample set containing TrainsetSize groups of samples;
Step 1-2, initialize the RDCNN: design the RDCNN structure, constructing the recurrent dynamic sub-network (RDSN) that generates the probability vectors and the probability prediction layer (PPL) that predicts the radar echo at the future time, providing the initialization model of the RDCNN for the offline training stage; Fig. 2 shows the structure of the recurrent dynamic convolutional neural network initialization model;
Step 1-3, initialize the training parameters of the RDCNN: let the network learning rate λ = 0.0001, the number of samples input per training pass BatchSize = 10, the maximum batch count of the training sample set BatchMax = ⌊TrainsetSize / BatchSize⌋, the current batch count BatchNum = 1, the maximum number of iterations of network training IterationMax = 40, and the current iteration number IterationNum = 1;
Step 1-4, read training samples: batch training is adopted; each training pass reads BatchSize groups of training samples from the training sample set obtained in step 1-1. Each group of training samples is {x1, x2, x3, x4, y}, containing 5 images in total, where {x1, x2, x3, x4} serves as the input image sequence and y is the corresponding control label;
Step 1-5, forward propagation: the RDSN extracts the features of the input image sequence and obtains the horizontal probability vector HPV and the vertical probability vector VPV; in the probability prediction layer, the last image of the input image sequence is convolved with VPV and then with HPV, giving the output forecast image of forward propagation;
Step 1-6, backpropagation: the error terms of the probability vectors are obtained in the PPL; from these, the error term of every network layer in the RDSN is calculated layer by layer from back to front; the gradients of each layer's error term with respect to the weighting and offset parameters are then calculated, and the network parameters are updated with the obtained gradients;
Step 1-7, offline training stage control: overall control of the offline network training stage is divided into the following three cases:
if unused training samples remain in the training sample set, i.e. BatchNum < BatchMax, return to step 1-4, continue reading BatchSize groups of training samples, and carry out network training;
if no unused training samples remain, i.e. BatchNum = BatchMax, and the current number of network iterations is less than the maximum number of iterations, i.e. IterationNum < IterationMax, set BatchNum = 1, return to step 1-4, continue reading BatchSize groups of training samples, and carry out network training;
if no unused training samples remain, i.e. BatchNum = BatchMax, and the number of network iterations has reached the maximum number of iterations, i.e. IterationNum = IterationMax, end the RDCNN offline training stage and obtain the converged RDCNN model.
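The three-case control flow of step 1-7 is a double loop over batches and iterations; a minimal sketch with the patent's counters (the TrainsetSize value here is an arbitrary stand-in):

```python
TrainsetSize, BatchSize = 100, 10          # BatchSize from step 1-3; TrainsetSize assumed
BatchMax = TrainsetSize // BatchSize       # maximum batch count per iteration
IterationMax = 40                          # maximum number of iterations (step 1-3)

updates = 0
IterationNum = 1
while True:
    BatchNum = 1
    while BatchNum <= BatchMax:            # read BatchSize samples, train once (step 1-4)
        updates += 1                       # stands in for one forward/backward pass
        BatchNum += 1
    if IterationNum == IterationMax:       # maximum iterations reached: training ends
        break
    IterationNum += 1                      # otherwise start the next iteration
```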
Step 1-1 data preprocessing includes the following steps:
Step 1-1-1, sampling: the images of the training image set are arranged in time order and distributed at equal intervals of 6 minutes, comprising N_Train images in total. TrainsetSize is determined from N_Train using the remainder Mod(N_Train, 4) and the floor function ⌊·⌋, so that 4×TrainsetSize+1 ≤ N_Train. After TrainsetSize is obtained, the first 4×TrainsetSize+1 images of the training image set are retained by sampling; sampling deletes the last images of the training image set so that the number of images meets the requirement;
Step 1-1-2, normalize images: image transformation and normalization operations are applied to the sampled images, converting the original 2000×2000 color images into 280×280 grayscale images;
Step 1-1-3, construct the training sample set: the training sample set is constructed from the grayscale images obtained in step 1-1-2. Every four adjacent images of the grayscale image collection, i.e. the {4N+1, 4N+2, 4N+3, 4N+4}-th images, form one input sequence, and the [4×(N+1)+1]-th image is cropped, its central 240×240 part being retained as the control label of the corresponding sample. The N-th group of samples is constructed as follows:
{x1, x2, x3, x4, y} = {G_{4N+1}, G_{4N+2}, G_{4N+3}, G_{4N+4}, Crop(G_{4(N+1)+1})},
where G_{4N+1} denotes the (4N+1)-th image of the grayscale image collection, N is a non-negative integer with N ∈ [0, TrainsetSize−1], and Crop(·) denotes the cropping operation that retains the central 240×240 part of the original image. This finally yields the training sample set containing TrainsetSize groups of training samples;
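The grouping of step 1-1-3 can be sketched as follows; the frames are stand-in arrays, and crop_center implements the Crop(·) operation that keeps the central 240×240 part:

```python
import numpy as np

def crop_center(img, size=240):
    """Crop(.): retain the central size x size part of the image."""
    r0 = (img.shape[0] - size) // 2
    c0 = (img.shape[1] - size) // 2
    return img[r0:r0 + size, c0:c0 + size]

def build_samples(gray_images):
    """Group every 4 consecutive frames as input; crop the 5th as control label."""
    n_groups = (len(gray_images) - 1) // 4         # TrainsetSize
    samples = []
    for N in range(n_groups):
        x = gray_images[4 * N: 4 * N + 4]          # {G_{4N+1}, ..., G_{4N+4}}
        y = crop_center(gray_images[4 * N + 4])    # the [4(N+1)+1]-th image, 240x240
        samples.append((x, y))
    return samples

imgs = [np.zeros((280, 280)) for _ in range(9)]    # 9 frames -> 2 sample groups
samples = build_samples(imgs)
```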
Wherein, step 1-1-2 includes the following steps:
Step 1-1-2-1, image conversion: the images sampled in step 1-1-1 are converted into grayscale images; the central 560×560 part of each original image is retained by cropping, and the cropped image is then compressed to a resolution of 280×280, yielding a 280×280 grayscale image;
Step 1-1-2-2, data normalization: every pixel value of the grayscale images obtained in step 1-1-2-1 is mapped from [0, 255] to [0, 1].
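Step 1-1-2 can be sketched as below; the 2×2 block-mean downscaling from 560×560 to 280×280 is our assumption (the patent does not fix the compression method), and the grayscale conversion is taken as already done:

```python
import numpy as np

def preprocess(img):
    """Crop the central 560x560, downscale to 280x280 (2x2 block mean, assumed),
    then map pixel values from [0, 255] to [0, 1] (step 1-1-2)."""
    r0 = (img.shape[0] - 560) // 2
    c0 = (img.shape[1] - 560) // 2
    crop = img[r0:r0 + 560, c0:c0 + 560]
    small = crop.reshape(280, 2, 280, 2).mean(axis=(1, 3))   # 560 -> 280
    return small / 255.0

gray = np.full((2000, 2000), 128.0)      # stand-in for a converted grayscale frame
out = preprocess(gray)
```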
Step 1-2 includes the following steps:
Step 1-2-1, construct the recurrent dynamic sub-network RDSN (Fig. 3 shows the structure of the recurrent dynamic sub-network):
the sub-network consists of 15 network layers, which are, from front to back, convolutional layer C1, down-sampling layer S1, hidden layer H1, convolutional layer C2, down-sampling layer S2, hidden layer H2, convolutional layer C3, down-sampling layer S3, hidden layer H3, convolutional layer C4, down-sampling layer S4, hidden layer H4, convolutional layer C5, hidden layer H5 and classifier layer F1;
Step 1-2-2, construct the probability prediction layer PPL (Fig. 4 shows the structure of the probability prediction layer):
dynamic convolutional layers DC1 and DC2 are constructed in the probability prediction layer; the vertical probability vector VPV output by the RDSN serves as the convolution kernel of dynamic convolutional layer DC1, and the horizontal probability vector HPV as the convolution kernel of dynamic convolutional layer DC2;
Wherein, step 1-2-1 includes the following steps:
Step 1-2-1-1, construct the convolutional layers: for a convolutional layer l_C, l_C ∈ {C1, C2, C3, C4, C5}, the following are determined: the number of output feature maps OutputMaps_lC, the convolution kernels k_lC and the offset parameters bias_lC. For the convolution kernels, the kernel width KernelSize_lC and the number of kernels KernelNumber_lC must be determined; the latter equals the product of the numbers of input and output feature maps of the convolutional layer, and the kernels are constructed according to the Xavier initialization method. The number of offset parameters equals the number of output feature maps of the layer. The output feature map width of layer l_C is OutputSize_lC, jointly determined by the width of the input feature maps of layer l_C and the kernel width, i.e. OutputSize_lC = OutputSize_{lC−1} − KernelSize_lC + 1, where OutputSize_{lC−1} denotes the output feature map width of the layer preceding convolutional layer l_C;
for convolutional layer C1, let the number of output feature maps OutputMaps_C1 = 12, the output feature map width OutputSize_C1 = 272, the kernel width KernelSize_C1 = 9, the offset parameters bias_C1 initialized to zero, and the number of convolution kernels KernelNumber_C1 = 48; the initial value of each kernel parameter is generated with the random-number function Rand() according to the Xavier method;
for convolutional layer C2, let OutputMaps_C2 = 32, OutputSize_C2 = 128, KernelSize_C2 = 9, the C2 offset parameters initialized to zero, and KernelNumber_C2 = 384; the initial value of each kernel parameter is generated according to the Xavier method;
for convolutional layer C3, let OutputMaps_C3 = 32, OutputSize_C3 = 56, KernelSize_C3 = 9, the C3 offset parameters initialized to zero, and KernelNumber_C3 = 1024; the initial value of each kernel parameter is generated according to the Xavier method;
for convolutional layer C4, let OutputMaps_C4 = 32, OutputSize_C4 = 20, KernelSize_C4 = 9, the C4 offset parameters initialized to zero, and KernelNumber_C4 = 1024; the initial value of each kernel parameter is generated according to the Xavier method;
for convolutional layer C5, let OutputMaps_C5 = 32, OutputSize_C5 = 4, KernelSize_C5 = 7, the C5 offset parameters initialized to zero, and KernelNumber_C5 = 1024; the initial value of each kernel parameter is generated according to the Xavier method;
Step 1-2-1-2, construct the hidden layers: for a hidden layer l_H, l_H ∈ {H1, H2, H3, H4, H5}, the following are determined: the number of output feature maps OutputMaps_lH, the convolution kernels k_lH and the offset parameters bias_lH. For the convolution kernels, the kernel width KernelSize_lH and the number of kernels KernelNumber_lH must be determined; the latter equals the product of the numbers of input and output feature maps of the hidden layer, and the kernels are constructed according to the Xavier initialization method. The number of offset parameters equals the number of output feature maps of the hidden layer. The output feature map width of layer l_H is OutputSize_lH, consistent with the input feature map width of the corresponding convolutional layer;
for hidden layer H1, let OutputMaps_H1 = 4, OutputSize_H1 = 280, KernelSize_H1 = 9, the offset parameters bias_H1 initialized to zero, and KernelNumber_H1 = 48; the initial value of each kernel parameter is generated with the random-number function Rand() according to the Xavier method;
for hidden layer H2, let OutputMaps_H2 = 8, OutputSize_H2 = 136, KernelSize_H2 = 9, the H2 offset parameters initialized to zero, and KernelNumber_H2 = 256; the initial value of each kernel parameter is generated according to the Xavier method;
for hidden layer H3, let OutputMaps_H3 = 8, OutputSize_H3 = 64, KernelSize_H3 = 9, the H3 offset parameters initialized to zero, and KernelNumber_H3 = 256; the initial value of each kernel parameter is generated according to the Xavier method;
for hidden layer H4, let OutputMaps_H4 = 8, OutputSize_H4 = 28, KernelSize_H4 = 9, the H4 offset parameters initialized to zero, and KernelNumber_H4 = 256; the initial value of each kernel parameter is generated according to the Xavier method;
for hidden layer H5, let OutputMaps_H5 = 8 and OutputSize_H5 = 10, with the H5 offset parameters initialized to zero. The H5 layer contains 256 weighting parameters k_H5; the initial value of each weighting parameter is generated according to the Xavier method;
Step 1-2-1-3, construct the down-sampling layers: the down-sampling layers contain no parameters that need training; the sampling kernels of down-sampling layers S1, S2, S3 and S4 are initialized identically. For a down-sampling layer l_S, l_S ∈ {S1, S2, S3, S4}, the number of output feature maps OutputMaps_lS is consistent with that of the convolutional layer immediately above it, and the output feature map width OutputSize_lS is 1/2 of the output feature map width of that convolutional layer, expressed as:
OutputMaps_lS = OutputMaps_{lS−1}, OutputSize_lS = OutputSize_{lS−1} / 2;
Step 1-2-1-4, construct the classifier layer: the classifier layer consists of one fully connected layer F1. The F1-layer weighting parameters are the horizontal weighting parameter matrix WH and the vertical weighting parameter matrix WV, both of size 41×512; each parameter of the weighting parameter matrices is initialized according to the Xavier method. The offset parameters are the horizontal offset parameter BH and the vertical offset parameter BV, both initialized as 41×1 one-dimensional zero vectors.
Step 1-5 includes the following steps:
Step 1-5-1, the RDSN calculates the probability vectors: the sub-network extracts the features of the input image sequence through the alternating processing of convolutional layers and down-sampling layers; the classifier layer then applies the Softmax function, yielding the horizontal probability vector HPV and the vertical probability vector VPV;
Step 1-5-2, the PPL outputs the forecast image: the HPV and VPV obtained in step 1-5-1 serve as the convolution kernels of the probability prediction layer; the last image of the input image sequence is convolved with VPV and then with HPV, giving the output forecast image of forward propagation.
Step 1-5-1 includes the following steps:
Step 1-5-1-1, judge the network layer type: let l denote the current network layer of the RDSN; l takes the values {H1, C1, S1, H2, C2, S2, H3, C3, S3, H4, C4, S4, H5, C5, F1} in turn, with initial value H1. The type of network layer l is judged: if l ∈ {H1, H2, H3, H4, H5}, l is a hidden layer and step 1-5-1-2 is executed; if l ∈ {C1, C2, C3, C4, C5}, l is a convolutional layer and step 1-5-1-3 is executed; if l ∈ {S1, S2, S3, S4}, l is a down-sampling layer and step 1-5-1-4 is executed; if l = F1, l is the classifier layer and step 1-5-1-5 is executed. The output feature maps of the convolutional layers from the previous training pass are denoted a_C′, where C ∈ {C1, C2, C3, C4, C5}; the initial value of a_C′ is the zero matrix;
Step 1-5-1-2, process a hidden layer: at this time l = l_H, l_H ∈ {H1, H2, H3, H4, H5}, and two cases are distinguished:
when l_H ∈ {H1, H2, H3, H4}, the j-th output feature map a_lH^j of layer l_H is calculated as follows: the corresponding feature maps in a_C′ (if l_H = H1, then C = C1) are expanded by zero-pixel filling to width ExpandSize_lH, then convolved with the corresponding convolution kernels of this layer; the convolution results are summed, the j-th offset parameter b_lH^j of layer l_H is added, and the result is processed by the ReLU activation function to obtain a_lH^j. The calculation formula is as follows:
a_lH^j = ReLU(Σ_{i=1..nh} Expand_Zero(a_C′^i) * k_lH^{ij} + b_lH^j),
where Expand_Zero(·) denotes the zero extension function (Fig. 5 shows a schematic diagram of zero expansion of a matrix), k_lH^{ij} is the convolution kernel relating the i-th input feature map and the j-th output feature map of layer l_H, nh is the number of input feature maps of the current hidden layer, and a_C′^i denotes the i-th input feature map of layer l_H. The value of ExpandSize_lH is determined by the input feature map width and the kernel size, with ExpandSize_lH = OutputSize_lH + KernelSize_lH − 1;
when l_H = H5, the j-th output feature map of layer H5 is calculated as follows: the feature maps of a_C5′ are expanded by zero-pixel filling to resolution 10×10 and multiplied by the corresponding weighting parameters of this layer; the results are summed, the j-th offset parameter b_H5^j of layer H5 is added, and the sum is processed by the ReLU activation function to obtain a_H5^j. The calculation formula is as follows:
a_H5^j = ReLU(Σ_i Expand_Zero(a_C5′^i) · k_H5^{ij} + b_H5^j),
where k_H5^{ij} is the weighting parameter relating the i-th input feature map and the j-th output feature map of layer H5;
all output feature maps of layer l_H are calculated in turn to obtain a_lH; l is updated to l+1, and the procedure returns to step 1-5-1-1 to judge the network type and carry out the operation of the next network layer;
Step 1-5-1-3, process a convolutional layer: at this time l = l_C, l_C ∈ {C1, C2, C3, C4, C5}. The j-th output feature map a_lC^j of layer l_C is calculated as follows: the input feature maps of layer l_C are each convolved with the corresponding convolution kernels of this layer; the convolution results are summed, the j-th offset parameter b_lC^j of layer l_C is added, and the result is processed by the ReLU activation function to obtain a_lC^j. The calculation formula is as follows:
a_lC^j = ReLU(Σ_{i=1..nc} a_{lC−1}^i * k_lC^{ij} + b_lC^j),
where k_lC^{ij} is the convolution kernel relating the i-th input feature map and the j-th output feature map of layer l_C, nc is the number of input feature maps of the convolutional layer, a_{lC−1}^i denotes the i-th input feature map of layer l_C (which is also the i-th output feature map of layer l_C − 1), and * denotes matrix convolution; if l_C = C1, layer l_C − 1 is the input layer.
All output feature maps of layer l_C are calculated in turn to obtain a_lC, and the value of a_lC is used to update a_C′ (for example, when l_C = C1, a_C1 updates a_C1′); l is updated to l+1, and the procedure returns to step 1-5-1-1 to judge the network type and carry out the operation of the next network layer;
Step 1-5-1-4, process a down-sampling layer: at this time l = l_S, l_S ∈ {S1, S2, S3, S4}. The output feature maps of the convolutional layer obtained in step 1-5-1-3 are each convolved with the sampling kernel and then sampled with stride 2; sampling gives the output feature maps of layer l_S. The calculation formula is as follows:
a_lS^j = Sample(a_{lS−1}^j * k_S),
where Sample(·) denotes the sampling processing with stride 2, k_S denotes the sampling kernel, l_S − 1 denotes the convolutional layer preceding the current down-sampling layer, and a_lS^j denotes the j-th output feature map of layer l_S. After the output feature maps a_lS of layer l_S are obtained, l is updated to l+1, and the procedure returns to step 1-5-1-1 to judge the network type and carry out the operation of the next network layer;
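Assuming the sampling kernel is a 2×2 all-1/4 kernel, which is consistent with the halved output width of step 1-2-1-3, the convolution-plus-stride-2 sampling of this step is 2×2 average pooling:

```python
import numpy as np

def downsample(a):
    """Stride-2 sampling with an assumed 2x2 all-1/4 kernel, i.e. average pooling."""
    h, w = a.shape[0] // 2, a.shape[1] // 2
    return a[:2 * h, :2 * w].reshape(h, 2, w, 2).mean(axis=(1, 3))

a_C1 = np.arange(16.0).reshape(4, 4)     # toy convolutional-layer output map
a_S1 = downsample(a_C1)                  # 2x2 down-sampled output map
```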
Step 1-5-1-4, compute the F1-layer probability vectors: here l = F1. By matrix transformation, the 32 output feature maps of C5 with resolution 4 × 4 are unfolded in column order into the output feature vector a^{F1} of layer F1 with resolution 512 × 1. The product of the horizontal weighting parameter matrix WH with a^{F1} and the product of the vertical weighting parameter matrix WV with a^{F1} are computed, the results are summed with the horizontal offset parameter BH and the vertical offset parameter BV respectively, and after processing by the Softmax function the horizontal probability vector HPV and the vertical probability vector VPV are obtained. The specific calculation formulas are as follows:

HPV = Softmax( WH × a^{F1} + BH ),
VPV = Softmax( WV × a^{F1} + BV ),

The vertical probability vector VPV is then transposed to obtain the final vertical probability vector;
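A minimal sketch of the F1-layer computation, assuming WH and WV are 41 × 512 matrices and BH, BV are 41 × 1 offsets, so that HPV can be used as a 1 × 41 row kernel and VPV as a 41 × 1 column kernel; these shapes are inferred from the 240 = 280 − 41 + 1 resolutions elsewhere in the description, not stated explicitly there.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a vector."""
    e = np.exp(z - z.max())
    return e / e.sum()

def classifier_probs(a_f1, WH, BH, WV, BV):
    """a_f1: (512, 1) unfolded C5 features.  Returns HPV as a 1 x 41 row
    vector and VPV as a 41 x 1 column vector, each summing to 1."""
    hpv = softmax(WH @ a_f1 + BH).reshape(1, -1)
    vpv = softmax(WV @ a_f1 + BV).reshape(-1, 1)
    return hpv, vpv
```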
Step 1-5-2 includes the following steps:
Step 1-5-2-1, vertical-direction prediction at layer DC1: the last input image of the input layer is convolved with the vertical probability vector VPV, giving the DC1-layer output feature map a^{DC1} with resolution 240 × 280;
Step 1-5-2-2, horizontal-direction prediction at layer DC2: the DC1-layer output feature map a^{DC1} is convolved with the horizontal probability vector HPV, giving the output prediction image of forward propagation, with resolution 240 × 240.
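The two dynamic convolutions can be sketched directly: a 41 × 1 column kernel applied to the 280 × 280 frame gives the 240 × 280 DC1 map, and a 1 × 41 row kernel then gives the 240 × 240 prediction. The shapes follow the description; the loop-based "valid" convolution is a naive illustration, not an efficient implementation.

```python
import numpy as np

def dynamic_predict(x, vpv, hpv):
    """x: (280, 280) last input frame; vpv: (41, 1) column kernel;
    hpv: (1, 41) row kernel.  Two separable 'valid' convolutions give the
    (240, 240) extrapolated frame (280 - 41 + 1 = 240 per direction)."""
    # DC1: convolve down the rows with the vertical probability vector
    v = vpv[::-1, :]                       # true convolution flips the kernel
    dc1 = np.zeros((240, 280))
    for r in range(240):
        dc1[r] = (x[r:r + 41, :] * v).sum(axis=0)
    # DC2: convolve across the columns with the horizontal probability vector
    h = hpv[:, ::-1]
    out = np.zeros((240, 240))
    for c in range(240):
        out[:, c] = (dc1[:, c:c + 41] * h).sum(axis=1)
    return out
```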
Step 1-6 includes the following steps:
Step 1-6-1, compute the PPL error terms: the difference between the prediction image obtained in step 1-5-2-2 and the control label of the input training sample is taken, the error terms of layers DC2 and DC1 are computed, and finally the error term δHPV of the horizontal probability vector and the error term δVPV of the vertical probability vector are obtained;
Step 1-6-2, compute the RDSN error terms: from the error term δHPV of the horizontal probability vector and the error term δVPV of the vertical probability vector, the error terms of the classifier layer F1, the convolutional layers (C5, C4, C3, C2, C1), the hidden layers (H5, H4, H3, H2, H1) and the down-sampling layers (S4, S3, S2, S1) are computed in turn from back to front; the resolution of any layer's error-term matrix is identical to the resolution of that layer's output feature maps;
Step 1-6-3, compute gradients: from the error terms obtained in step 1-6-2, the gradient of each network layer's error term with respect to that layer's weighting parameters and offset parameters is computed;
Step 1-6-4, update parameters: the gradient values of each network layer's weighting parameters and offset parameters obtained in step 1-6-3 are multiplied by the learning rate of the RDCNN, giving the update terms of each layer's weighting and offset parameters; the differences between the original weighting and offset parameters and their respective update terms give the updated weighting and offset parameters.
Step 1-6-1 includes the following steps:
Step 1-6-1-1, compute the error term of dynamic convolutional layer DC2: the difference between the prediction image obtained in step 1-5-2-2 and the control label of this group of samples gives the error-term matrix δDC2 of size 240 × 240;
Step 1-6-1-2, compute the error term of dynamic convolutional layer DC1: the DC2-layer error-term matrix δDC2 is expanded to 240 × 320 by zero padding, the horizontal probability vector is rotated by 180 degrees, and the expanded error-term matrix is convolved with the rotated horizontal probability vector, giving the DC1-layer error-term matrix δDC1 of size 240 × 280. The formula is as follows:
δDC1 = Expand_Zero(δDC2) * rot180(HPV),
In the above formula, rot180(·) denotes rotation by an angle of 180°. As an illustration of zero expansion, a 2 × 2 matrix is extended to a 4 × 4 matrix: in the matrix after zero expansion, the central region of resolution 2 × 2 agrees with the original matrix, and the remaining positions are filled with zero pixels;
Step 1-6-1-3, compute the probability-vector error terms: to compute the error term of the horizontal probability vector HPV, the DC1-layer output feature map is convolved with the error-term matrix δDC2; the 1 × 41 row vector obtained is the error term δHPV of HPV. The formula is as follows:
δHPV = a^{DC1} * δDC2,
To compute the error term of the vertical probability vector VPV, the input feature map of the input layer is convolved with the error-term matrix δDC1; the 41 × 1 column vector obtained is the error term δVPV of VPV. The formula is as follows:
δVPV = x4 * δDC1,
In the above formula, x4 is the last image in the input image sequence of the training sample;
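The δDC1 computation of step 1-6-1-2 can be sketched as a "full" convolution along the column direction: pad k − 1 zero columns on each side and slide the kernel; convolving with the 180°-rotated row vector is the same as correlating with the vector itself. The tiny shapes in the check are illustrative.

```python
import numpy as np

def backprop_dc1(delta_dc2, hpv):
    """delta_dc2: DC2-layer error matrix; hpv: (1, k) row kernel.
    Expand_Zero pads (k - 1) zero columns on each side, then the padded
    matrix is convolved with rot180(hpv), i.e. correlated with hpv."""
    k = hpv.shape[1]
    padded = np.pad(delta_dc2, ((0, 0), (k - 1, k - 1)))
    rows, wp = padded.shape
    out = np.zeros((rows, wp - k + 1))
    for c in range(out.shape[1]):
        out[:, c] = (padded[:, c:c + k] * hpv).sum(axis=1)
    return out
```

With a 240 × 240 δDC2 and k = 41 this pads to 240 × 320 and yields the 240 × 280 δDC1 of the description.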
Step 1-6-2 includes the following steps:
Step 1-6-2-1, compute the error term of classifier layer F1: the probability-vector error terms δVPV and δHPV obtained in step 1-6-1-3 are matrix-multiplied with the F1-layer vertical weighting parameter matrix WV and horizontal weighting parameter matrix WH respectively, and the two matrix products are summed and averaged, giving the F1-layer error term δF1. The formula is as follows:

δF1 = ( (WH)^T × (δHPV)^T + (WV)^T × δVPV ) / 2,

In the above formula, × denotes the matrix product and (·)^T denotes the transpose of a matrix; the resulting δF1 has size 512 × 1;
Step 1-6-2-2, compute the error term of convolutional layer C5: by matrix transformation, the F1-layer error term δF1 obtained in step 1-6-2-1 is reshaped into 32 matrices of resolution 4 × 4, giving the C5-layer error term δC5; δ_32^{C5} denotes the 32nd transformed matrix of resolution 4 × 4;
Step 1-6-2-3, judge the network-layer type: let l denote the network layer currently processed in the RDSN; l takes the values {H5, S4, C4, H4, S3, C3, H3, S2, C2, H2, S1, C1, H1} in turn, with initial value H5. The type of network layer l is judged: if l ∈ {H5, H4, H3, H2, H1}, then l is a hidden layer, and step 1-6-2-4 is executed; if l ∈ {S4, S3, S2, S1}, then l is a down-sampling layer, and step 1-6-2-5 is executed; if l ∈ {C4, C3, C2, C1}, then l is a convolutional layer, and step 1-6-2-6 is executed;
Step 1-6-2-4, compute a hidden-layer error term: here l = lH, lH ∈ {H5, H4, H3, H2, H1}. To compute the i-th error-term matrix δ_i^{lH} of layer lH, each error-term matrix δ_j^{l+1} of layer l+1 (a convolutional layer) is first expanded by zero padding to width ExpandSize_{l+1} (ExpandSize_{l+1} = OutputSize_{l+1} + 2·(KernelSize_{l+1} − 1)); the corresponding convolution kernels are rotated by 180 degrees, the expanded matrices are convolved with the rotated kernels, and the convolution results are summed, giving the i-th error-term matrix δ_i^{lH} of layer lH. The formula is as follows:

δ_i^{lH} = Σ_{j=1..nc} Expand_Zero(δ_j^{l+1}) * rot180(k_ij^{l+1}),

In the above formula, nc denotes the number of error terms of layer l+1 (the convolutional layer); it equals the number of output feature maps of layer l+1, i.e. nc = OutputMaps_{l+1}.
All error-term matrices are computed in turn, giving the error terms δ^{lH} of layer lH; l is updated to l−1, and step 1-6-2-3 is returned to, to judge the network type and compute the error term of the previous network layer;
Step 1-6-2-5, compute a down-sampling-layer error term: here l = lS, lS ∈ {S4, S3, S2, S1}. To compute the i-th error-term matrix δ_i^{lS} of layer lS, each error-term matrix δ_j^{l+2} of layer l+2 (the corresponding convolutional layer) is expanded by zero padding to width ExpandSize_{l+2} (ExpandSize_{l+2} = OutputSize_{l+2} + 2·(KernelSize_{l+2} − 1)); the corresponding convolution kernels k^{l+2} are rotated by 180 degrees, the expanded matrices are convolved with the rotated kernels, and the convolution results are summed, giving the i-th error-term matrix δ_i^{lS} of layer lS. The formula is as follows:

δ_i^{lS} = Σ_{j=1..nc} Expand_Zero(δ_j^{l+2}) * rot180(k_ij^{l+2}),

In the above formula, nc denotes the number of error terms of layer l+2 (the convolutional layer); it equals the number of output feature maps of layer l+2, i.e. nc = OutputMaps_{l+2}.
All error-term matrices are computed in turn, giving the error terms δ^{lS} of layer lS; l is updated to l−1, and step 1-6-2-3 is returned to, to judge the network type and compute the error term of the previous network layer;
Step 1-6-2-6, compute a convolutional-layer error term: here l = lC, lC ∈ {C4, C3, C2, C1}; since the initial value of l in step 1-6-2-3 is H5, the case lC = C5 cannot occur. For the i-th error-term matrix δ_i^{lC} of layer lC, the corresponding i-th error-term matrix δ_i^{l+1} of layer l+1 (a down-sampling layer) is first up-sampled; Fig. 6 illustrates the up-sampling of a 2 × 2 matrix. During up-sampling, the error value of each element of δ_i^{l+1} is distributed evenly over its sampling region, giving an up-sampled matrix whose resolution equals that of the output feature maps of layer lC. The inner product of the derivative of the activation function at the corresponding feature map of layer lC and the up-sampled matrix just obtained then gives the i-th error-term matrix δ_i^{lC} of layer lC. The formula is as follows:

δ_i^{lC} = ReLU'( a_i^{lC} ) ∘ UpSample( δ_i^{l+1} ),

In the above formula, ∘ denotes the element-wise (Hadamard) matrix product, and ReLU'(·) denotes the derivative of the ReLU activation function, whose form is as follows:

ReLU'(x) = 1 if x > 0, and ReLU'(x) = 0 otherwise;

UpSample(·) denotes the up-sampling function: after up-sampling, each pixel of the original image corresponds to one up-sampling region, and each original pixel value is distributed evenly over the pixels of its sampling region. All error-term matrices are computed in turn, giving the error terms δ^{lC} of layer lC;
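The UpSample(·) error redistribution and the ReLU-derivative gating can be sketched as follows; `np.kron` replicates each error entry over its 2 × 2 sampling region, and the division by 4 distributes it evenly (our reading of the "mean allocation" above).

```python
import numpy as np

def upsample_avg(delta, s=2):
    """Spread each error value evenly over its s x s sampling region."""
    return np.kron(delta, np.ones((s, s))) / (s * s)

def conv_error(a_map, delta_next):
    """Error term of a convolutional map: ReLU'(a), which is 1 where the
    activation is positive and 0 elsewhere, multiplied element-wise with
    the up-sampled error of the following down-sampling layer."""
    return (a_map > 0).astype(float) * upsample_avg(delta_next)
```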
Step 1-6-2-7: at this point layer l is a convolutional layer, i.e. l = lC; two cases follow:
if l ≠ C1, l is updated to l−1, and step 1-6-2-3 is returned to, to judge the network type and compute the error term of the previous network layer;
if l = C1, the computation of the sub-network error terms in step 1-6-2 ends;
Step 1-6-3 includes the following steps:
Step 1-6-3-1, compute the gradients of the convolutional-layer error terms with respect to the convolution kernels: let lC denote the convolutional layer currently processed, lC ∈ {C1, C2, C3, C4, C5}; starting from layer C1, the gradient of each convolutional layer's error term with respect to its convolution kernels is computed in turn. The i-th input feature map a_i^{lC−1} of the convolutional layer is convolved with the j-th error-term matrix δ_j^{lC} of layer lC; the convolution result is the gradient value ∇k_ij^{lC} of the corresponding convolution kernel. The formula is as follows:

∇k_ij^{lC} = a_i^{lC−1} * δ_j^{lC}, i ∈ [1, OutputMaps_{lC−1}], j ∈ [1, OutputMaps_{lC}],

In the above formula, OutputMaps_{lC} and OutputMaps_{lC−1} respectively denote the number of output feature maps of layer lC and of layer lC−1;
Step 1-6-3-2, compute the gradients of each convolutional layer's error term with respect to the biases: let lC denote the convolutional layer currently processed, lC ∈ {C1, C2, C3, C4, C5}; starting from layer C1, the gradient of each convolutional layer's error term with respect to its biases is computed in turn. All elements of the j-th error-term matrix δ_j^{lC} of layer lC are summed, giving the gradient value ∇b_j^{lC} of the j-th bias of this layer. The formula is as follows:

∇b_j^{lC} = Sum( δ_j^{lC} ),

In the above formula, Sum(·) denotes summation over all elements of a matrix;
Step 1-6-3-3, compute the gradients of the hidden-layer error terms with respect to the convolution kernels: let lH denote the hidden layer currently processed, lH ∈ {H1, H2, H3, H4, H5}; starting from layer H1, the gradient of each hidden layer's error term with respect to its convolution kernels is computed in turn. The hidden-layer error term is first cropped, retaining its central part of the prescribed width (when lH = H5, the central 4 × 4 part of the H5-layer error term is retained), denoted δ̂^{lH}; then the i-th input feature map a_i^{lH−1} of the hidden layer is convolved with the j-th component δ̂_j^{lH}; the convolution result is the gradient value ∇k_ij^{lH} of the corresponding convolution kernel. The formula is as follows:

∇k_ij^{lH} = a_i^{lH−1} * δ̂_j^{lH}, i ∈ [1, OutputMaps_{lH−1}], j ∈ [1, OutputMaps_{lH}],

In the above formula, OutputMaps_{lH} and OutputMaps_{lH−1} respectively denote the number of output feature maps of layer lH and of layer lH−1;
Step 1-6-3-4, compute the gradients of each hidden layer's error term with respect to the biases: let lH denote the hidden layer currently processed, lH ∈ {H1, H2, H3, H4, H5}; starting from layer H1, the gradient of each hidden layer's error term with respect to its biases is computed in turn. All elements of the j-th component of δ̂^{lH} obtained in step 1-6-3-3 are summed, giving the gradient value ∇b_j^{lH} of the j-th bias of this layer. The formula is as follows:

∇b_j^{lH} = Sum( δ̂_j^{lH} ),

In the above formula, Sum(·) denotes summation over all elements of a matrix;
Step 1-6-3-5, compute the gradients of the F1-layer error term with respect to the weighting parameters: the products of the error terms δHPV, δVPV of the horizontal and vertical probability vectors with the F1-layer error term δF1 are computed separately; the results are the gradients of the F1-layer error term with respect to the weighting parameters WH, WV. The formulas are as follows:

∇WH = (δHPV)^T × (δF1)^T,
∇WV = δVPV × (δF1)^T,

In the above formulas, ∇WH is the gradient of the error term with respect to the horizontal weighting parameters, and ∇WV is the gradient of the error term with respect to the vertical weighting parameters;
Step 1-6-3-6, compute the gradients of the F1-layer error term with respect to the offset parameters: the error terms δHPV, δVPV of the horizontal and vertical probability vectors serve directly as the gradients of the F1-layer error term with respect to the horizontal offset parameter BH and the vertical offset parameter BV. The formulas are as follows:

∇BH = (δHPV)^T,
∇BV = δVPV,

In the above formulas, ∇BH is the gradient of the error term with respect to the horizontal offset parameters, and ∇BV is the gradient of the error term with respect to the vertical offset parameters;
Step 1-6-4 includes the following steps:
Step 1-6-4-1, update each convolutional layer's weighting parameters: the gradients of each convolutional layer's error term with respect to the convolution kernels obtained in step 1-6-3-1 are multiplied by the learning rate of the RDCNN, giving the correction terms of the kernels; the difference between each original kernel and its correction term gives the updated kernel k_ij^{lC}. The formula is as follows:

k_ij^{lC} = k_ij^{lC} − λ·∇k_ij^{lC},

In the above formula, λ is the network learning rate determined in step 1-3, λ = 0.0001;
Step 1-6-4-2, update each convolutional layer's offset parameters: the gradients of each convolutional layer's error term with respect to the biases obtained in step 1-6-3-2 are multiplied by the learning rate of the RDCNN, giving the correction terms of the offset parameters; the difference between each original bias term and its correction term gives the updated bias term b_j^{lC}. The formula is as follows:

b_j^{lC} = b_j^{lC} − λ·∇b_j^{lC};

Step 1-6-4-3, update each hidden layer's weighting parameters: the gradients of each hidden layer's error term with respect to the convolution kernels obtained in step 1-6-3-3 are multiplied by the learning rate of the RDCNN, giving the correction terms of the kernels; the difference between each original kernel and its correction term gives the updated kernel k_ij^{lH}. The formula is as follows:

k_ij^{lH} = k_ij^{lH} − λ·∇k_ij^{lH},

In the above formula, λ is the network learning rate determined in step 1-3, λ = 0.0001;
Step 1-6-4-4, update each hidden layer's offset parameters: the gradients of each hidden layer's error term with respect to the biases obtained in step 1-6-3-4 are multiplied by the learning rate of the RDCNN, giving the correction terms of the offset parameters; the difference between each original bias term and its correction term gives the updated bias term b_j^{lH}. The formula is as follows:

b_j^{lH} = b_j^{lH} − λ·∇b_j^{lH};
Step 1-6-4-5, update the F1-layer weighting parameters: the gradients of the F1-layer error term with respect to the weighting parameters WH and WV obtained in step 1-6-3-5 are multiplied by the learning rate of the RDCNN, giving the correction terms of the weighting parameters; the differences between the original weighting parameters WH, WV and the respective correction terms give the updated WH and WV. The formulas are as follows:

WH = WH − λ·∇WH,
WV = WV − λ·∇WV;

Step 1-6-4-6, update the F1-layer offset parameters: the gradients of the F1-layer error term with respect to the offset parameters BH and BV obtained in step 1-6-3-6 are multiplied by the learning rate of the RDCNN, giving the correction terms of the offset parameters; the differences between the original offset parameters BH, BV and the respective correction terms give the updated BH and BV. The formulas are as follows:

BH = BH − λ·∇BH,
BV = BV − λ·∇BV.
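All of the updates in step 1-6-4 share the same form, plain gradient descent with the fixed learning rate λ = 0.0001; a one-line sketch:

```python
import numpy as np

LAMBDA = 0.0001  # network learning rate fixed in the description

def sgd_update(param, grad, lr=LAMBDA):
    """theta <- theta - lr * grad; applied alike to convolution kernels,
    biases, WH/WV and BH/BV."""
    return param - lr * grad
```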
Step 2 includes the following steps:
Step 2-1, data preprocessing: the test image set is input, every image in the test image set is standardized and converted into a 280 × 280 gray-level image, and the gray-level image set is then divided to construct a test sample set containing TestsetSize groups of samples;
Step 2-2, read the test samples: the TestsetSize groups of test samples obtained in step 2-1 are input into the trained RDCNN;
Step 2-3, forward propagation: the features of the input image sequence are extracted in the sub-network, giving the horizontal probability vector HPV_test and the vertical probability vector VPV_test; in the probability prediction layer, the last image of the input image sequence is convolved successively with VPV_test and HPV_test, giving the final extrapolated image of the RDCNN.
Step 2-1 includes the following steps:
Step 2-1-1, sampling: the images in the test image set are arranged in temporal order and distributed at equal intervals, with a time interval of 6 minutes; the set contains N_Test images in all, and TestsetSize is determined by the following formula:
if Mod(N_Test, 4) = 0,
if Mod(N_Test, 4) ≠ 0,
After TestsetSize is obtained, the first 4 × TestsetSize + 1 images of the test image set are retained by sampling; during sampling, the last images of the test image set are deleted so that the number of images meets the requirement;
Step 2-1-2, image normalization: the images obtained by sampling undergo image transformation and normalization operations; the original color images of resolution 2000 × 2000 are converted into gray-level images of resolution 280 × 280;
Step 2-1-3, construct the test sample set: the test sample set is constructed from the gray-level image set obtained in step 2-1-2. Every four adjacent images of the gray-level image set, i.e. images {4M+1, 4M+2, 4M+3, 4M+4}, form one group of input sequences, and image [4 × (M+1) + 1] is cropped, the central part of resolution 240 × 240 being retained as the control label of the corresponding sample, where M is an integer with M ∈ [0, TestsetSize−1]; this gives a test sample set containing TestsetSize groups of test samples;
Step 2-1-2 includes the following steps:
Step 2-1-2-1, image conversion: the color echo-intensity CAPPI images are converted into gray-level images; the part of resolution 560 × 560 at the center of the original image is then retained by cropping, and the cropped image is compressed to resolution 280 × 280, giving a grayscale image of resolution 280 × 280;
Step 2-1-2-2, data normalization: the value of every pixel of the grayscale images obtained in step 2-1-2-1 is mapped from [0~255] to [0~1];
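The pixel-value mapping of step 2-1-2-2 is a plain division by 255; a sketch:

```python
import numpy as np

def normalize(img_u8):
    """Map 8-bit gray values from [0, 255] into [0, 1]."""
    return img_u8.astype(np.float64) / 255.0
```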
Step 2-3 includes the following steps:
Step 2-3-1, compute the sub-network probability vectors: the features of the input image sequence are extracted in the sub-network through the alternating processing of the convolutional and down-sampling layers, and then processed by the Softmax function in the classifier layer, giving the horizontal probability vector HPV_test and the vertical probability vector VPV_test;
Step 2-3-2, compute the output image of the probability prediction layer: the VPV_test and HPV_test obtained in step 2-3-1 serve as the convolution kernels of the probability prediction layer; the last image of the input image sequence is convolved successively with VPV_test and HPV_test, giving the final extrapolated image of the RDCNN;
Step 2-3-1 includes the following steps:
Step 2-3-1-1, judge the network-layer type: let p denote the network layer currently processed in the RDSN; p takes the values {H1, C1, S1, H2, C2, S2, H3, C3, S3, H4, C4, S4, H5, C5, F1} in turn, with initial value H1. The type of network layer p is judged: if p ∈ {H1, H2, H3, H4, H5}, then p is a hidden layer, and step 2-3-1-2 is executed; if p ∈ {C1, C2, C3, C4, C5}, then p is a convolutional layer, and step 2-3-1-3 is executed; if p ∈ {S1, S2, S3, S4}, then p is a down-sampling layer, and step 2-3-1-4 is executed; if p = F1, then p is the classifier layer, and step 2-3-1-5 is executed. During the test process, the output feature maps of this test are denoted aC'', where C ∈ {C1, C2, C3, C4, C5}; the initial value of aC'' is the zero matrix;
Step 2-3-1-2, process a hidden layer: here p = pH, pH ∈ {H1, H2, H3, H4, H5}; two cases are distinguished:
When pH ∈ {H1, H2, H3, H4}, the v-th output feature map a_v^{pH} of layer pH is computed first. The corresponding feature map in aC'' (if pH = H1, then C = C1) is expanded by zero-pixel filling to width ExpandSize_{pH}; it is then convolved with the corresponding kernels of this layer, the convolution results are summed, the v-th offset parameter b_v^{pH} of layer pH is added to the summed result, and the result is processed by the ReLU activation function, giving a_v^{pH}. The calculation formula is as follows:

a_v^{pH} = ReLU( Σ_{u=1..mh} Expand_Zero( a_u ) * k_uv^{pH} + b_v^{pH} ),

In the above formula, Expand_Zero(·) denotes the zero-extension function, k_uv^{pH} is the convolution kernel corresponding to the u-th input feature map and the v-th output feature map of layer pH, mh is the number of input feature maps of the current hidden layer, and a_u denotes the u-th input feature map of layer pH; the value of ExpandSize_{pH} is determined by the width of the input feature maps and the size of the convolution kernel.
When pH = H5, the v-th output feature map a_v^{H5} of layer H5 is computed first: the feature maps of aC5'' are expanded to resolution 10 × 10 by zero-pixel filling and multiplied by the corresponding weighting parameters of this layer; the results are summed, the v-th offset parameter b_v^{H5} of layer H5 is added to the summed result, and the result is processed by the ReLU activation function, giving a_v^{H5}. The calculation formula is as follows:

a_v^{H5} = ReLU( Σ_u w_uv^{H5} · Expand_Zero( a_u^{C5''} ) + b_v^{H5} ),

In the above formula, w_uv^{H5} is the weighting parameter corresponding to the u-th input feature map and the v-th output feature map of layer H5.
All output feature maps of layer pH are computed in turn, giving a^{pH}; p is updated to p+1, and step 2-3-1-1 is returned to, to judge the network type and process the next network layer;
Step 2-3-1-3, process a convolutional layer: here p = pC, pC ∈ {C1, C2, C3, C4, C5}. The v-th output feature map a_v^{pC} of layer pC is computed first: the input feature maps of layer pC are each convolved with the corresponding kernels of this layer, the convolution results are summed, the v-th offset parameter b_v^{pC} of layer pC is added to the summed result, and the result is processed by the ReLU activation function, giving a_v^{pC}. The calculation formula is as follows:

a_v^{pC} = ReLU( Σ_{u=1..mc} a_u^{pC−1} * k_uv^{pC} + b_v^{pC} ),

In the above formula, k_uv^{pC} is the convolution kernel corresponding to the u-th input feature map and the v-th output feature map of layer pC, mc is the number of input feature maps of the convolutional layer, a_u^{pC−1} denotes the u-th input feature map of layer pC, which is at the same time the u-th output feature map of layer pC−1, and * denotes matrix convolution; if pC = C1, then layer pC−1 is the input layer.
All output feature maps of layer pC are computed in turn, giving a^{pC}, whose value updates aC'' (pC = C; for example, when pC = C1, aC1 updates aC1''). p is updated to p+1, and step 2-3-1-1 is returned to, to judge the network type and process the next network layer;
Step 2-3-1-4, process a down-sampling layer: here p = pS, pS ∈ {S1, S2, S3, S4}. The output feature maps of the convolutional layer obtained in step 2-3-1-3 are each convolved with the kernel k^{pS}, and the results are sampled with stride 2, giving the output feature maps a^{pS} of layer pS. The calculation formula is as follows:

a_j^{pS} = Sample( a_j^{pS−1} * k^{pS}, 2 ),

In the above formula, Sample(·) denotes the stride-2 sampling operation, pS−1 denotes the convolutional layer preceding the current down-sampling layer, and a_j^{pS} denotes the j-th output feature map among the output feature maps a^{pS} of layer pS. After a^{pS} is obtained, p is updated to p+1, and step 2-3-1-1 is returned to, to judge the network type and process the next network layer;
Step 2-3-1-5, compute the F1-layer probability vectors: if network layer p is the classifier layer, i.e. p = F1, then by matrix transformation the 32 output feature maps of C5 with resolution 4 × 4 are unfolded in column order into the F1-layer output feature vector a^{F1} of resolution 512 × 1. The products of the horizontal parameter matrix WH and of the vertical parameter matrix WV with a^{F1} are then computed separately, the results are summed with the horizontal offset parameter BH and the vertical offset parameter BV respectively, and after processing by the Softmax function the summed results give the horizontal probability vector HPV_test and the vertical probability vector VPV_test. The calculation formulas are as follows:

HPV_test = Softmax( WH × a^{F1} + BH ),
VPV_test = Softmax( WV × a^{F1} + BV ),

The vertical probability vector VPV_test is then transposed to obtain the final vertical probability vector;
Step 2-3-2 includes the following steps:
Step 2-3-2-1, vertical-direction prediction at layer DC1: the last input image of the input layer is convolved with the vertical probability vector VPV_test, giving the DC1-layer output feature map a_test^{DC1} with resolution 240 × 280;
Step 2-3-2-2, horizontal-direction prediction at layer DC2: the a_test^{DC1} obtained in step 2-3-2-1 is convolved with the horizontal probability vector HPV_test, giving the final extrapolated image of the RDCNN, with resolution 240 × 240.
Claims (10)
1. A radar echo extrapolation model training method based on a recurrent dynamic convolutional neural network, characterized by comprising the following steps:
Step 1, data preprocessing: a training image set is input, every image in the training image set is standardized and converted into a 280 × 280 gray-level image, giving a gray-level image set; the gray-level image set is divided to construct a training sample set containing TrainsetSize groups of samples;
Step 2, initialize the RDCNN: the RDCNN structure is designed, configured as a recurrent dynamic sub-network RDSN that generates the probability vectors and a probability prediction layer PPL that predicts the radar echo at the future time instant, providing the initialization model of the RDCNN for the off-line training stage;
Step 3, initialize the training parameters of the RDCNN: let the network learning rate λ = 0.0001, the number of samples input at each training step BatchSize = 10, the maximum number of batch training steps of the training sample set BatchMax = ⌊TrainsetSize / BatchSize⌋, the current batch training number BatchNum = 1, the maximum number of iterations of network training IterationMax = 40, and the current iteration number IterationNum = 1;
Step 4, read training samples: in the batch training mode, BatchSize groups of training samples are read at each training step from the training sample set obtained in step 1; every group of training samples contains 5 images {x1, x2, x3, x4, y}, where {x1, x2, x3, x4} serves as the input image sequence and y is the corresponding control label;
Step 5, forward propagation: the features of the input image sequence are extracted in the RDSN, giving the horizontal probability vector HPV and the vertical probability vector VPV; in the probability prediction layer, the last image of the input image sequence is convolved successively with VPV and HPV, giving the output prediction image of forward propagation;
Step 6, backpropagation: the error terms of the probability vectors are obtained in the PPL; from the probability-vector error terms, the error terms of each network layer in the RDSN are then computed in turn from back to front, the gradients of the error terms with respect to the weighting parameters and offset parameters in each network layer are computed, and the network parameters are updated with the gradients obtained;
Step 7, off-line training stage control: overall control of the off-line neural network training stage is divided into the following three cases:
if unused training samples remain in the training sample set, i.e. BatchNum < BatchMax, then step 4 is returned to, BatchSize further groups of training samples are read, and network training continues;
if no unused training samples remain in the training sample set, i.e. BatchNum = BatchMax, and the current number of network iterations is less than the maximum number of iterations, i.e. IterationNum < IterationMax, then BatchNum = 1 is set, step 4 is returned to, BatchSize groups of training samples are read again, and network training continues;
if no unused training samples remain in the training sample set, i.e. BatchNum = BatchMax, and the number of network iterations has reached the maximum number of iterations, i.e. IterationNum = IterationMax, then the RDCNN off-line training stage ends, giving the converged RDCNN model.
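The three-case control logic of step 7 amounts to a double loop over iterations and batches; the sketch below only reproduces the bookkeeping (BatchNum, IterationNum), with the per-batch work of steps 4-6 reduced to a counter.

```python
def offline_training(batch_max, iteration_max=40):
    """Step-7 control: run BatchMax batches per iteration and stop after
    IterationMax iterations, returning the total number of batch updates."""
    iteration_num = 1
    batches_run = 0
    while True:
        for batch_num in range(1, batch_max + 1):
            batches_run += 1  # steps 4-6 (read batch, forward, backward)
        if iteration_num == iteration_max:
            break  # training ends with the converged RDCNN model
        iteration_num += 1
    return batches_run
```

For example, batch_max = 5 with iteration_max = 3 performs 15 parameter updates in total.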
2. The method according to claim 1, characterized in that step 1 comprises the following steps:
Step 1-1, sampling: the training image set is input, the images in the training image set are arranged in temporal order and distributed at equal intervals, with a time interval of 6 minutes; the set contains N_Train images in all, and TrainsetSize is determined by the following formula:
where Mod(N_Train, 4) denotes N_Train modulo 4 and ⌊·⌋ denotes the greatest integer not exceeding its argument; after TrainsetSize is obtained, the first 4 × TrainsetSize + 1 images of the training image set are retained by sampling; during sampling, the last images of the training image set are deleted so that the number of images meets the requirement;
Step 1-2, normalize the images: the images obtained by sampling undergo image transformation and normalization operations; the original color images of resolution 2000 × 2000 are converted into gray-level images of resolution 280 × 280;
Step 1-3, construct the training sample set: the training sample set is constructed from the gray-level images obtained in step 1-2. Every four adjacent images of the gray-level image set, i.e. images {4N+1, 4N+2, 4N+3, 4N+4}, form one group of input sequences, and image [4 × (N+1) + 1] is cropped, the central part of resolution 240 × 240 being retained as the control label of the corresponding sample; for the N-th group of samples, the construction is as follows:
In the above formula, G_{4N+1} denotes the (4N+1)-th image of the gray-level image set, N is an integer with N ∈ [0, TrainsetSize−1], and Crop(·) denotes the cropping operation, the part of size 240 × 240 at the center of the original image being retained after cropping; this finally gives the training sample set containing TrainsetSize groups of training samples.
3. The method according to claim 2, characterized in that step 1-2 comprises the following steps:
Step 1-2-1, image conversion: the images sampled in step 1-1 are converted into gray-level images; the part of resolution 560 × 560 at the center of the original image is retained by cropping, the cropped image is compressed to resolution 280 × 280, and a grayscale image of resolution 280 × 280 is obtained;
Step 1-2-2, data normalization: the value of every pixel of the grayscale images obtained in step 1-2-1 is mapped from [0~255] to [0~1].
4. The method according to claim 3, characterized in that step 2 comprises the following steps:
Step 2-1, construct the recurrent dynamic sub-network RDSN:
the sub-network is composed of the following network layers, from front to back: convolutional layer C1, down-sampling layer S1, hidden layer H1, convolutional layer C2, down-sampling layer S2, hidden layer H2, convolutional layer C3, down-sampling layer S3, hidden layer H3, convolutional layer C4, down-sampling layer S4, hidden layer H4, convolutional layer C5, hidden layer H5 and classifier layer F1;
Step 2-2, construct the probability prediction layer PPL:
dynamic convolutional layers DC1 and DC2 are constructed in the probability prediction layer; the vertical probability vector VPV output by the RDSN serves as the convolution kernel of dynamic convolutional layer DC1, and the horizontal probability vector HPV serves as the convolution kernel of dynamic convolutional layer DC2.
5. The method according to claim 4, characterized in that step 2-1 comprises the following steps:
Step 2-1-1, constructing the convolutional layers: for a convolutional layer lC, lC ∈ {C1, C2, C3, C4, C5}, determine the following: the number of output feature maps OutputMaps_lC, the convolution kernels k^lC, and the bias parameters bias^lC. For the convolution kernels, it is necessary to determine the kernel width KernelSize_lC and the number of kernels KernelNumber_lC; the latter equals the product of the numbers of input and output feature maps of the layer, and the kernels are constructed according to the Xavier initialization method. The number of bias parameters equals the number of output feature maps of the layer. The output feature-map width of layer lC is OutputSize_lC; its value is jointly determined by the input feature-map width of layer lC and the kernel width, i.e.
OutputSize_lC = OutputSize_(lC−1) − KernelSize_lC + 1,
where OutputSize_(lC−1) denotes the output feature-map width of the layer preceding convolutional layer lC;
For convolutional layer C1, let the number of C1 output feature maps OutputMaps_C1 = 12, the C1 output feature-map width OutputSize_C1 = 272, the C1 kernel width KernelSize_C1 = 9, the C1 bias parameters bias_C1 initialized to zero, and the number of C1 convolution kernels k^C1 KernelNumber_C1 = 48; the initial value of each kernel parameter is generated by the Xavier method, with rand() generating the random numbers;
For convolutional layer C2, let OutputMaps_C2 = 32, OutputSize_C2 = 128, KernelSize_C2 = 9, the C2 bias parameters initialized to zero, and KernelNumber_C2 = 384, the kernel parameters being initialized in the same way;
For convolutional layer C3, let OutputMaps_C3 = 32, OutputSize_C3 = 56, KernelSize_C3 = 9, the C3 bias parameters initialized to zero, and KernelNumber_C3 = 1024, the kernel parameters being initialized in the same way;
For convolutional layer C4, let OutputMaps_C4 = 32, OutputSize_C4 = 20, KernelSize_C4 = 9, the C4 bias parameters initialized to zero, and KernelNumber_C4 = 1024, the kernel parameters being initialized in the same way;
For convolutional layer C5, let OutputMaps_C5 = 32, OutputSize_C5 = 4, KernelSize_C5 = 7, the C5 bias parameters initialized to zero, and KernelNumber_C5 = 1024, the kernel parameters being initialized in the same way;
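The exact random formula is lost from the claim text (only "rand()" and the Xavier method survive), so the following NumPy sketch assumes the standard Xavier uniform bound sqrt(6 / (fan_in + fan_out)); the layer-C1 map counts follow the claims, with 4 input maps inferred from KernelNumber_C1 = 48 = 4 × 12.

```python
import numpy as np

def xavier_kernels(in_maps, out_maps, ksize, seed=0):
    """Build in_maps * out_maps square kernels of width ksize with
    Xavier-uniform initial values (assumed formula)."""
    rng = np.random.default_rng(seed)
    fan_in = in_maps * ksize * ksize
    fan_out = out_maps * ksize * ksize
    bound = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-bound, bound, size=(in_maps, out_maps, ksize, ksize))

# Layer C1 of the claims: 12 output maps, kernel width 9,
# KernelNumber_C1 = 4 * 12 = 48 kernels in total.
k_c1 = xavier_kernels(in_maps=4, out_maps=12, ksize=9)
```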
Step 2-1-2, constructing the hidden layers: for a hidden layer lH, lH ∈ {H1, H2, H3, H4, H5}, determine the following: the number of output feature maps OutputMaps_lH, the convolution kernels k^lH, and the bias parameters bias^lH. For the convolution kernels, it is necessary to determine the kernel width KernelSize_lH and the number of kernels KernelNumber_lH; the latter equals the product of the numbers of input and output feature maps of the hidden layer, and the kernels are constructed according to the Xavier initialization method. The number of bias parameters equals the number of output feature maps of the hidden layer. The output feature-map width of layer lH is OutputSize_lH, which is consistent with the input feature-map width of the corresponding convolutional layer;
For hidden layer H1, let the number of H1 output feature maps OutputMaps_H1 = 4, the H1 output feature-map width OutputSize_H1 = 280, the H1 kernel width KernelSize_H1 = 9, the H1 bias parameters bias_H1 initialized to zero, and the number of H1 convolution kernels k^H1 KernelNumber_H1 = 48; the initial value of each kernel parameter is generated by the Xavier method, with rand() generating the random numbers;
For hidden layer H2, let OutputMaps_H2 = 8, OutputSize_H2 = 136, KernelSize_H2 = 9, the H2 bias parameters initialized to zero, and KernelNumber_H2 = 256, the kernel parameters being initialized in the same way;
For hidden layer H3, let OutputMaps_H3 = 8, OutputSize_H3 = 64, KernelSize_H3 = 9, the H3 bias parameters initialized to zero, and KernelNumber_H3 = 256, the kernel parameters being initialized in the same way;
For hidden layer H4, let OutputMaps_H4 = 8, OutputSize_H4 = 28, KernelSize_H4 = 9, the H4 bias parameters initialized to zero, and KernelNumber_H4 = 256, the kernel parameters being initialized in the same way;
For hidden layer H5, let OutputMaps_H5 = 8 and OutputSize_H5 = 10; the H5 bias parameters are initialized to zero, and the H5 layer contains 256 weighting parameters k^H5, each initialized in the same way;
Step 2-1-3, constructing the down-sampling layers: the down-sampling layers contain no parameters that need training; the sampling kernels of down-sampling layers S1, S2, S3 and S4 are initialized as 2 × 2 mean kernels (each element equal to 1/4). For a down-sampling layer lS, lS ∈ {S1, S2, S3, S4}, the number of output feature maps OutputMaps_lS is consistent with the number of output feature maps of the convolutional layer immediately above it, and the output feature-map width OutputSize_lS is 1/2 of that convolutional layer's output feature-map width, i.e. OutputSize_lS = OutputSize_(lS−1) / 2;
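A sketch of the stride-2 down-sampling, assuming the 2 × 2 all-1/4 sampling kernel described above (with that kernel, convolving and then sampling with stride 2 is equivalent to 2 × 2 mean pooling):

```python
import numpy as np

def downsample(feature_map):
    """Down-sampling layers S1-S4: 2x2 mean kernel plus stride-2 sampling,
    which halves the feature-map width."""
    h, w = feature_map.shape
    return feature_map.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

# C1 outputs 272x272 maps, so S1 outputs 136x136 maps, as in the claims.
s1_map = downsample(np.arange(272.0 * 272.0).reshape(272, 272))
```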
Step 2-1-4, constructing the classifier layer: the classifier layer consists of one fully connected layer F1. The F1 weighting parameters are the horizontal weighting parameter matrix WH and the vertical weighting parameter matrix WV, each of size 41 × 512, and each element of the weighting parameter matrices is randomly initialized. The bias parameters are the horizontal bias parameter BH and the vertical bias parameter BV, each initialized as a 41 × 1 one-dimensional zero vector.
6. The method according to claim 4, characterized in that step 5 comprises the following steps:
Step 5-1, the RDSN computes the probability vectors: features of the input image sequence are extracted in the sub-network through the alternating processing of the convolutional and down-sampling layers, then processed by the Softmax function in the classifier layer, yielding the horizontal probability vector HPV and the vertical probability vector VPV;
Step 5-2, the PPL outputs the forecast image: with the HPV and VPV obtained in step 5-1 as the convolution kernels of the probability prediction layer, the last image of the input image sequence is convolved successively with VPV and then with HPV, yielding the output forecast image of forward propagation.
7. The method according to claim 5, characterized in that step 5-1 comprises the following steps:
Step 5-1-1, judging the network-layer type: let l denote the current network layer in the RDSN; l successively takes the values {H1, C1, S1, H2, C2, S2, H3, C3, S3, H4, C4, S4, H5, C5, F1}, with initial value H1. Judge the type of network layer l: if l ∈ {H1, H2, H3, H4, H5}, then l is a hidden layer; execute step 5-1-2. If l ∈ {C1, C2, C3, C4, C5}, then l is a convolutional layer; execute step 5-1-3. If l ∈ {S1, S2, S3, S4}, then l is a down-sampling layer; execute step 5-1-4. If l = F1, then l is the classifier layer; execute step 5-1-5. During training, the output feature maps of the most recently processed convolutional layer are denoted a_C', where C ∈ {C1, C2, C3, C4, C5}; the initial value of a_C' is the zero matrix;
Step 5-1-2, processing a hidden layer: here l = lH, lH ∈ {H1, H2, H3, H4, H5}, and two cases arise:
When lH ∈ {H1, H2, H3, H4}, first compute the j-th output feature map a_j^lH of layer lH. If lH = H1, then C = C1. Expand the width of the corresponding feature maps in a_C' by zero-pixel padding to OutputSize_lH + KernelSize_lH − 1, convolve them with the corresponding convolution kernels of this layer, sum the convolution results, add the j-th bias parameter b_j^lH of layer lH, and process the sum with the ReLU activation function to obtain a_j^lH; the calculation formula is:
a_j^lH = ReLU( Σ_{i=1..nh} Expand_Zero(a_i^C') * k_{ij}^lH + b_j^lH ),
where Expand_Zero() denotes the zero-extension function, k_{ij}^lH is the convolution kernel connecting the i-th input feature map and the j-th output feature map of layer lH, b_j^lH is the j-th bias of layer lH, nh is the number of input feature maps of the current hidden layer, and a_i^C' denotes the i-th input feature map of layer lH. The width OutputSize_lH is determined by the width of the expanded input feature maps and the kernel size, with OutputSize_lH = ExpandSize_lH − KernelSize_lH + 1;
When lH = H5, first compute the j-th output feature map a_j^H5 of layer H5: expand the resolution of the feature maps in a_C5' to 10 × 10 by zero-pixel padding, multiply them by the corresponding weighting parameters of this layer, sum the results, add the j-th bias parameter b_j^H5 of layer H5, and process the sum with the ReLU activation function to obtain a_j^H5; the calculation formula is:
a_j^H5 = ReLU( Σ_i Expand_Zero(a_i^C5') · k_{ij}^H5 + b_j^H5 ),
where k_{ij}^H5 is the weighting parameter connecting the i-th input feature map and the j-th output feature map of layer H5;
Compute all output feature maps of layer lH in turn to obtain the output feature maps a^lH of layer lH, update l to l + 1, and return to step 5-1-1 to judge the network type and process the next network layer;
Step 5-1-3, processing a convolutional layer: here l = lC, lC ∈ {C1, C2, C3, C4, C5}. First compute the j-th output feature map a_j^lC of layer lC: convolve each input feature map of layer lC with the corresponding convolution kernel of this layer, sum the convolution results, add the j-th bias parameter b_j^lC of layer lC, and process the sum with the ReLU activation function to obtain a_j^lC; the calculation formula is:
a_j^lC = ReLU( Σ_{i=1..nc} a_i^(lC−1) * k_{ij}^lC + b_j^lC ),
where k_{ij}^lC is the convolution kernel connecting the i-th input feature map and the j-th output feature map of layer lC, nc is the number of input feature maps of the convolutional layer, a_i^(lC−1) denotes the i-th input feature map of layer lC (which is also the i-th output feature map of layer lC − 1), and * denotes matrix convolution; if lC = C1, then layer lC − 1 is the input layer;
Compute all output feature maps of layer lC in turn to obtain the output feature maps a^lC of layer lC, update a_C' with the value of a^lC, update l to l + 1, and return to step 5-1-1 to judge the network type and process the next network layer;
Step 5-1-4, processing a down-sampling layer: here l = lS, lS ∈ {S1, S2, S3, S4}. Convolve each output feature map of the convolutional layer obtained in step 5-1-3 with the sampling kernel of layer lS, then sample with stride 2 to obtain the output feature maps a^lS of layer lS; the calculation formula is:
a_j^lS = Sample( a_j^(lS−1) * kernel_lS ),
where Sample() denotes sampling with stride 2, lS − 1 denotes the convolutional layer immediately preceding the current down-sampling layer, and a_j^lS denotes the j-th output feature map of layer lS. After obtaining the output feature maps a^lS of layer lS, update l to l + 1 and return to step 5-1-1 to judge the network type and process the next network layer;
Step 5-1-5, computing the F1 probability vectors: here l = F1. By matrix transformation, unfold the 32 output feature maps of C5, each of resolution 4 × 4, in column order into the F1 output feature vector a_F1 of resolution 512 × 1. Compute the product of the horizontal weighting parameter matrix WH with a_F1 and of the vertical weighting parameter matrix WV with a_F1, add the horizontal bias parameter BH and the vertical bias parameter BV respectively, and process the results with the Softmax function to obtain the horizontal probability vector HPV and the vertical probability vector VPV; the calculation formulas are:
HPV = Softmax( WH × a_F1 + BH ),
VPV = Softmax( WV × a_F1 + BV );
The vertical probability vector VPV is then transposed to obtain the final vertical probability vector.
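Step 5-1-5 can be sketched as follows. The claims fix only the shapes (a_F1 is 512 × 1, WH/WV are 41 × 512, BH/BV are 41 × 1 zero vectors) and the Softmax, so random placeholders are used for the values, and the horizontal vector is reshaped to a 1 × 41 row so it can act as the DC2 kernel.

```python
import numpy as np

def softmax(x):
    """Numerically stable Softmax, as applied in classifier layer F1."""
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)
a_f1 = rng.standard_normal((512, 1))          # F1 output feature vector
WH = rng.standard_normal((41, 512))           # horizontal weighting matrix
WV = rng.standard_normal((41, 512))           # vertical weighting matrix
BH, BV = np.zeros((41, 1)), np.zeros((41, 1))  # bias parameters (zeros)

HPV = softmax(WH @ a_f1 + BH).reshape(1, 41)  # horizontal: 1x41 row vector
VPV = softmax(WV @ a_f1 + BV).reshape(41, 1)  # vertical: 41x1 column vector
```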
8. The method according to claim 7, characterized in that step 5-2 comprises the following steps:
Step 5-2-1, DC1 layer vertical-direction prediction: convolve the last input image of the input layer with the vertical probability vector VPV to obtain the DC1 output feature map a_DC1 with a resolution of 240 × 280;
Step 5-2-2, DC2 layer horizontal-direction prediction: convolve the DC1 output feature map a_DC1 with the horizontal probability vector HPV to obtain the output forecast image of forward propagation, with a resolution of 240 × 240.
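A sketch of the two "valid" convolutions of step 5-2, which confirms the claimed resolutions (280 − 41 + 1 = 240). Uniform placeholder probability vectors are used, since only their shapes come from the claims.

```python
import numpy as np

def valid_conv2d(img, kernel):
    """'Valid' 2-D convolution: the probability vectors act directly as
    kernels, so no kernel flipping is performed here."""
    kh, kw = kernel.shape
    oh, ow = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * img[i:i + oh, j:j + ow]
    return out

img = np.random.default_rng(0).random((280, 280))  # last input image
VPV = np.full((41, 1), 1 / 41)  # placeholder vertical probability vector
HPV = np.full((1, 41), 1 / 41)  # placeholder horizontal probability vector
a_dc1 = valid_conv2d(img, VPV)     # DC1 output: 240x280
pred = valid_conv2d(a_dc1, HPV)    # DC2 output (forecast image): 240x240
```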
9. The method according to claim 8, characterized in that step 6 comprises the following steps:
Step 6-1, computing the PPL error terms: take the difference between the forecast image obtained in step 5-2-2 and the comparison label of the input training sample, compute the error terms of the DC2 and DC1 layers, and finally obtain the error term δHPV of the horizontal probability vector and the error term δVPV of the vertical probability vector;
Step 6-2, computing the RDSN error terms: according to δHPV and δVPV, compute from back to front the error terms of classifier layer F1, the convolutional layers (C5, C4, C3, C2, C1), the hidden layers (H5, H4, H3, H2, H1) and the down-sampling layers (S4, S3, S2, S1); the resolution of any layer's error term matrix is consistent with the resolution of that layer's output feature maps;
Step 6-3, computing the gradients: from the error terms obtained in step 6-2, compute the gradient values of each network layer's error term with respect to that layer's weighting and bias parameters;
Step 6-4, updating the parameters: multiply the gradient values of each network layer's weighting and bias parameters obtained in step 6-3 by the learning rate of the RDCNN to obtain the update terms of each layer's weighting and bias parameters, and subtract the respective update terms from the original weighting and bias parameters to obtain the updated weighting and bias parameters.
10. The method according to claim 9, characterized in that step 6-1 comprises the following steps:
Step 6-1-1, computing the error term of dynamic convolutional layer DC2: take the difference between the forecast image obtained in step 5-2-2 and the comparison label of this group of samples, obtaining the error term matrix δDC2 of size 240 × 240;
Step 6-1-2, computing the error term of dynamic convolutional layer DC1: expand the DC2 error term matrix δDC2 to 240 × 320 by zero padding, rotate the horizontal probability vector by 180 degrees, and convolve the expanded error term matrix with the rotated horizontal probability vector to obtain the DC1 error term matrix δDC1 of size 240 × 280; the formula is:
δDC1 = Expand_Zero(δDC2) * rot180(HPV),
where rot180() denotes rotation by 180°. As an illustration of Expand_Zero(), a 2 × 2 matrix is zero-extended to a 4 × 4 matrix: in the matrix after zero expansion, the central 2 × 2 region is consistent with the original matrix and the remaining positions are filled with zero pixels;
Step 6-1-3, computing the probability vector error terms: to compute the error term of the horizontal probability vector HPV, convolve the DC1 output feature map with the error term matrix δDC2; the 1 × 41 row vector obtained by the convolution is the error term δHPV of HPV:
δHPV = aDC1 * δDC2;
To compute the error term of the vertical probability vector VPV, convolve the input feature map of the input layer with the error term matrix δDC1; the 41 × 1 column vector obtained by the convolution is the error term δVPV of VPV:
δVPV = x_last * δDC1,
where x_last denotes the last image of the input image sequence of the training sample;
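Step 6-1-2 can be sketched as follows, re-using a small "valid" convolution helper; padding each side horizontally by KernelSize − 1 = 40 zeros takes the 240 × 240 error matrix to 240 × 320 as the claim states, and convolving with the rotated 1 × 41 vector yields the 240 × 280 DC1 error matrix.

```python
import numpy as np

def valid_conv2d(img, kernel):
    """'Valid' 2-D convolution helper (same as in the forward pass)."""
    kh, kw = kernel.shape
    oh, ow = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * img[i:i + oh, j:j + ow]
    return out

delta_dc2 = np.random.default_rng(0).random((240, 240))  # DC2 error term
HPV = np.full((1, 41), 1 / 41)                           # placeholder HPV
# Expand_Zero: pad horizontally by KernelSize - 1 = 40 zeros on each side.
expanded = np.pad(delta_dc2, ((0, 0), (40, 40)))
delta_dc1 = valid_conv2d(expanded, np.rot90(HPV, 2))     # rot180(HPV)
```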
Step 6-2 comprises the following steps:
Step 6-2-1, computing the error term of classifier layer F1: multiply the probability vector error terms δVPV and δHPV obtained in step 6-1-3 by the F1 vertical weighting parameter matrix WV and horizontal weighting parameter matrix WH respectively, then sum the resulting matrix products and take their average, obtaining the F1 error term δF1; here × denotes the matrix product and (·)^T the matrix transpose, and the resulting δF1 has size 512 × 1;
Step 6-2-2, computing the error term of convolutional layer C5: by matrix transformation, transform the F1 error term δF1 obtained in step 6-2-1 into 32 matrices of resolution 4 × 4, which together constitute the C5 error term δC5; the 32nd of these matrices is the 32nd transformed 4 × 4 matrix;
Step 6-2-3, judging the network-layer type: let l denote the current network layer in the RDSN; l successively takes the values {H5, S4, C4, H4, S3, C3, H3, S2, C2, H2, S1, C1, H1}, with initial value H5. Judge the type of network layer l: if l ∈ {H5, H4, H3, H2, H1}, then l is a hidden layer; execute step 6-2-4. If l ∈ {S4, S3, S2, S1}, then l is a down-sampling layer; execute step 6-2-5. If l ∈ {C4, C3, C2, C1}, then l is a convolutional layer; execute step 6-2-6;
Step 6-2-4, computing a hidden-layer error term: here l = lH, lH ∈ {H5, H4, H3, H2, H1}. To compute the i-th error term matrix δ_i^lH of layer lH, expand each error term matrix δ_(l+1) of convolutional layer l + 1 by zero padding to the width
ExpandSize_(l+1) = OutputSize_(l+1) + 2·(KernelSize_(l+1) − 1),
then rotate the corresponding convolution kernels k_{ij}^(l+1) by 180 degrees, convolve the expanded matrices with the rotated kernels, and sum the convolution results to obtain δ_i^lH; the formula is:
δ_i^lH = Σ_{j=1..nc} Expand_Zero(δ_j^(l+1)) * rot180(k_{ij}^(l+1)),
where nc denotes the number of error terms of convolutional layer l + 1; its value is identical to the number of output feature maps of layer l + 1, i.e. nc = OutputMaps_(l+1);
Compute all error term matrices in turn to obtain the error term matrices δ^lH of layer lH, update l to l − 1, and return to step 6-2-3 to judge the network type and compute the error terms of the previous network layer;
Step 6-2-5, computing a down-sampling-layer error term: here l = lS, lS ∈ {S4, S3, S2, S1}. To compute the i-th error term matrix δ_i^lS of layer lS, expand each error term matrix δ_(l+2) of convolutional layer l + 2 by zero padding to the width
ExpandSize_(l+2) = OutputSize_(l+2) + 2·(KernelSize_(l+2) − 1),
then rotate the corresponding convolution kernels k_{ij}^(l+2) by 180 degrees, convolve the expanded matrices with the rotated kernels, and sum the convolution results to obtain δ_i^lS; the formula is:
δ_i^lS = Σ_{j=1..nc} Expand_Zero(δ_j^(l+2)) * rot180(k_{ij}^(l+2)),
where nc denotes the number of error terms of convolutional layer l + 2; its value is identical to the number of output feature maps of layer l + 2, i.e. nc = OutputMaps_(l+2);
Compute all error term matrices in turn to obtain the error term matrices δ^lS of layer lS, update l to l − 1, and return to step 6-2-3 to judge the network type and compute the error terms of the previous network layer;
Step 6-2-6, computing a convolutional-layer error term: here l = lC, lC ∈ {C4, C3, C2, C1}; since the initial value of l in step 6-2-3 is H5, the case lC = C5 does not arise. For the i-th error term matrix δ_i^lC of layer lC, first up-sample the corresponding i-th error term matrix δ_i^(l+1) of down-sampling layer l + 1: when up-sampling, the error value of each element of δ_i^(l+1) is distributed evenly over its sampling region, yielding an up-sampled matrix whose resolution matches the output feature maps of layer lC. Then compute the elementwise product of the derivative of the activation function at the corresponding feature map of layer lC with the up-sampled matrix, obtaining δ_i^lC; the formula is:
δ_i^lC = ReLU'(a_i^lC) ∘ UpSample(δ_i^(l+1)),
where ∘ denotes the elementwise (Hadamard) matrix product and ReLU'() denotes the derivative of the ReLU activation function, whose form is:
ReLU'(x) = 1 if x > 0, and 0 otherwise;
UpSample() denotes the up-sampling function: after up-sampling, each pixel of the original matrix corresponds to one sampling region, and each original value is distributed evenly over the pixels of its sampling region. Compute all error term matrices in turn to obtain the error term matrices δ^lC of layer lC;
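The two ingredients of step 6-2-6, sketched in NumPy: the mean-allocating up-sampling (factor 2, matching the stride-2 down-sampling layers) and the ReLU derivative.

```python
import numpy as np

def upsample_mean(delta, factor=2):
    """UpSample: distribute each error value evenly over its
    factor x factor sampling region, preserving the total error mass."""
    return np.kron(delta, np.full((factor, factor), 1.0 / factor ** 2))

def relu_prime(z):
    """Derivative of ReLU: 1 where z > 0, else 0."""
    return (z > 0).astype(float)

delta_s4 = np.ones((10, 10))  # e.g. an S4 error term matrix (10x10)
a_c4_like = np.random.default_rng(0).standard_normal((20, 20))
delta_c4 = relu_prime(a_c4_like) * upsample_mean(delta_s4)  # 20x20
```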
Step 6-2-7: at this point layer l is a convolutional layer, i.e. l = lC, and two cases arise:
if l ≠ C1, update l to l − 1 and return to step 6-2-3 to judge the network type and compute the error terms of the previous network layer;
if l = C1, the sub-network error-term computation of step 6-2 is finished;
Step 6-3 comprises the following steps:
Step 6-3-1, computing the gradients of the convolutional-layer error terms with respect to the convolution kernels: let lC denote the convolutional layer currently processed, lC ∈ {C1, C2, C3, C4, C5}; starting from layer C1, compute in turn the gradient of each convolutional layer's error term with respect to its convolution kernels. Convolve the i-th input feature map a_i^(lC−1) of the convolutional layer with the j-th error term matrix δ_j^lC of layer lC; the convolution result is the gradient value ∇k_{ij}^lC of the corresponding convolution kernel:
∇k_{ij}^lC = a_i^(lC−1) * δ_j^lC,
where j and i respectively range over the numbers of output feature maps of layer lC and of layer lC − 1;
Step 6-3-2, computing the gradients of each convolutional-layer error term with respect to the biases: with lC as above, starting from layer C1, compute in turn the gradient of each convolutional layer's error term with respect to its biases. Sum all elements of the j-th error term matrix δ_j^lC of layer lC to obtain the gradient value ∇b_j^lC of the layer's j-th bias:
∇b_j^lC = Sum(δ_j^lC),
where Sum() denotes summation over all elements of a matrix;
Step 6-3-3, computing the gradients of the hidden-layer error terms with respect to the convolution kernels: let lH denote the hidden layer currently processed, lH ∈ {H1, H2, H3, H4, H5}; starting from layer H1, compute in turn the gradient of each hidden layer's error term with respect to its convolution kernels. First crop the hidden-layer error term, retaining the central part of width OutputSize_lH − KernelSize_lH + 1, denoted δ_cut^lH; when lH = H5, retain the central 4 × 4 part of the H5 error term. Then convolve the i-th input feature map of the hidden layer with the j-th component of δ_cut^lH; the convolution result is the gradient value ∇k_{ij}^lH of the corresponding convolution kernel:
∇k_{ij}^lH = a_i^(lH−1) * δ_cut,j^lH,
where j and i respectively range over the numbers of output feature maps of layer lH and of layer lH − 1;
Step 6-3-4, computing the gradients of each hidden-layer error term with respect to the biases: with lH as above, starting from layer H1, compute in turn the gradient of each hidden layer's error term with respect to its biases. Sum all elements of the j-th component of δ_cut^lH obtained in step 6-3-3 to obtain the gradient value ∇b_j^lH of the layer's j-th bias:
∇b_j^lH = Sum(δ_cut,j^lH),
where Sum() denotes summation over all elements of a matrix;
Step 6-3-5, computing the gradients of the F1 error term with respect to the weighting parameters: compute the products of the error terms δHPV and δVPV of the horizontal and vertical probability vectors with the F1 error term δF1; the results are the gradient values of the F1 error term with respect to the weighting parameters WH and WV:
∇WH = (δHPV)^T × (δF1)^T,
∇WV = δVPV × (δF1)^T,
where ∇WH is the gradient value of the error term with respect to the horizontal weighting parameters and ∇WV the gradient value with respect to the vertical weighting parameters;
Step 6-3-6, computing the gradients of the F1 error term with respect to the bias parameters: take the error terms of the horizontal and vertical probability vectors directly as the gradient values of the F1 error term with respect to the horizontal bias parameter BH and the vertical bias parameter BV:
∇BH = (δHPV)^T,
∇BV = δVPV,
where ∇BH is the gradient value of the error term with respect to the horizontal bias parameter and ∇BV the gradient value with respect to the vertical bias parameter;
Step 6-4 comprises the following steps:
Step 6-4-1, updating each convolutional layer's weighting parameters: multiply the gradients of each convolutional layer's error term with respect to the convolution kernels, obtained in step 6-3-1, by the learning rate λ of the RDCNN to obtain the correction terms of the convolution kernels, and subtract each correction term from the original convolution kernel to obtain the updated convolution kernel:
k_{ij}^lC = k_{ij}^lC − λ·∇k_{ij}^lC;
Step 6-4-2, updating each convolutional layer's bias parameters: multiply the gradients of each convolutional layer's error term with respect to the biases, obtained in step 6-3-2, by the learning rate of the RDCNN to obtain the correction terms of the bias parameters, and subtract each correction term from the original bias term to obtain the updated bias term:
b_j^lC = b_j^lC − λ·∇b_j^lC;
Step 6-4-3, updating each hidden layer's weighting parameters: multiply the gradients of each hidden layer's error term with respect to the convolution kernels, obtained in step 6-3-3, by the learning rate of the RDCNN to obtain the correction terms of the convolution kernels, and subtract each correction term from the original convolution kernel to obtain the updated convolution kernel:
k_{ij}^lH = k_{ij}^lH − λ·∇k_{ij}^lH;
Step 6-4-4, updating each hidden layer's bias parameters: multiply the gradients of each hidden layer's error term with respect to the biases, obtained in step 6-3-4, by the learning rate of the RDCNN to obtain the correction terms of the bias parameters, and subtract each correction term from the original bias term to obtain the updated bias term:
b_j^lH = b_j^lH − λ·∇b_j^lH;
Step 6-4-5, updating the F1 weighting parameters: multiply the gradient values of the F1 error term with respect to the weighting parameters WH and WV, obtained in step 6-3-5, by the learning rate of the RDCNN to obtain the correction terms of the weighting parameters, and subtract the respective correction terms from the original WH and WV to obtain the updated WH and WV:
WH = WH − λ·∇WH,
WV = WV − λ·∇WV;
Step 6-4-6, updating the F1 bias parameters: multiply the gradient values of the F1 error term with respect to the bias parameters BH and BV, obtained in step 6-3-6, by the learning rate of the RDCNN to obtain the correction terms of the bias parameters, and subtract the respective correction terms from the original BH and BV to obtain the updated BH and BV:
BH = BH − λ·∇BH,
BV = BV − λ·∇BV.
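All six update steps share a single rule, parameter ← parameter − λ·gradient; a minimal sketch (λ = 0.05 is a hypothetical value, as the claims only name λ as the learning rate of the RDCNN without fixing it):

```python
import numpy as np

LEARNING_RATE = 0.05  # hypothetical value for lambda

def sgd_update(param, grad, lr=LEARNING_RATE):
    """Steps 6-4-1 .. 6-4-6: subtract the learning-rate-scaled gradient
    (the 'correction term') from the original parameter."""
    return param - lr * grad

WH = np.ones((41, 512))                 # F1 horizontal weighting matrix
grad_WH = np.full((41, 512), 0.1)       # gradient from step 6-3-5
WH_new = sgd_update(WH, grad_WH)        # updated WH
```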
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810402200.8A CN108846409A (en) | 2018-04-28 | 2018-04-28 | Radar echo extrapolation model training method based on cyclic dynamic convolution neural network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108846409A true CN108846409A (en) | 2018-11-20 |
Family
ID=64212387
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108846409A (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110456355A (en) * | 2019-08-19 | 2019-11-15 | 河南大学 | A kind of Radar Echo Extrapolation method based on long short-term memory and generation confrontation network |
CN110568442A (en) * | 2019-10-15 | 2019-12-13 | 中国人民解放军国防科技大学 | Radar echo extrapolation method based on confrontation extrapolation neural network |
CN110705508A (en) * | 2019-10-15 | 2020-01-17 | 中国人民解放军战略支援部队航天工程大学 | Satellite identification method of ISAR image |
CN113421252A (en) * | 2021-07-07 | 2021-09-21 | 南京思飞捷软件科技有限公司 | Actual detection method for vehicle body welding defects based on improved convolutional neural network |
CN115393725A (en) * | 2022-10-26 | 2022-11-25 | 西南科技大学 | Bridge crack identification method based on feature enhancement and semantic segmentation |
CN116184124A (en) * | 2023-04-26 | 2023-05-30 | 华东交通大学 | Power distribution network fault type identification method |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170140285A1 (en) * | 2015-11-13 | 2017-05-18 | Microsoft Technology Licensing, Llc | Enhanced Computer Experience From Activity Prediction |
CN106886023A (en) * | 2017-02-27 | 2017-06-23 | 中国人民解放军理工大学 | A kind of Radar Echo Extrapolation method based on dynamic convolutional neural networks |
CN107632295A (en) * | 2017-09-15 | 2018-01-26 | 广东工业大学 | A kind of Radar Echo Extrapolation method based on sequential convolutional neural networks |
Non-Patent Citations (1)
Title |
---|
EN SHI et al.: "A Method of Weather Radar Echo Extrapolation Based on Convolutional Neural Networks", 24th International Conference, MMM (MultiMedia Modeling) 2018, Part I *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106886023A (en) | Radar echo extrapolation method based on dynamic convolutional neural networks | |
CN108846409A (en) | Radar echo extrapolation model training method based on cyclic dynamic convolution neural network | |
CN106203430B (en) | Salient object detection method based on foreground focus degree and background prior | |
CN105787439B (en) | Depth-image human joint localization method based on convolutional neural networks | |
CN106355151B (en) | Three-dimensional SAR image target recognition method based on deep belief network | |
CN111368769B (en) | Ship multi-target detection method based on improved anchor-box generation model | |
CN109800628A (en) | Network structure and detection method for improving SSD small-object pedestrian detection performance | |
CN108254741A (en) | Target track prediction method based on recurrent neural network | |
CN110472483A (en) | Method and device for small-sample semantic feature enhancement for SAR images | |
CN109816012A (en) | Multi-scale target detection method integrating context information | |
CN107480730A (en) | Power equipment recognition model construction method and system, and power equipment recognition method | |
CN110532894A (en) | Remote sensing target detection method based on boundary-constrained CenterNet | |
CN108776779A (en) | SAR image sequence target recognition method based on convolutional recurrent network | |
CN105913025A (en) | Deep learning face recognition method based on multi-feature fusion | |
CN107358142A (en) | Semi-supervised polarimetric SAR image classification method based on random forest combination | |
CN110163836A (en) | Deep-learning-based excavator detection method for high-altitude inspection | |
CN109766936A (en) | Image change detection method based on information transfer and attention mechanism | |
CN112465006B (en) | Target tracking method and device based on graph neural network | |
CN107563411A (en) | Online SAR target detection method based on deep learning | |
CN109919045A (en) | Small-scale pedestrian detection and recognition method based on cascaded convolutional network | |
CN108447057A (en) | SAR image change detection method based on saliency and deep convolutional network | |
CN109598711A (en) | Thermal image defect extraction method based on feature mining and neural network | |
CN104680151B (en) | High-resolution panchromatic remote sensing image change detection method accounting for snow cover | |
CN107229084A (en) | Method for automatic identification, tracking and prediction of convective system targets | |
CN112949407A (en) | Remote sensing image building vectorization method based on deep learning and point set optimization | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 2018-11-20 |