CN108846409A - Radar echo extrapolation model training method based on cyclic dynamic convolution neural network
- Publication number
- CN108846409A (application CN201810402200.8A)
- Authority
- CN
- China
- Prior art keywords
- layer
- layers
- error term
- characteristic pattern
- convolution kernel
- Prior art date
- Legal status: Pending (an assumption, not a legal conclusion)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
Abstract
The invention discloses a radar echo extrapolation model training method based on a cyclic dynamic convolutional neural network (RDCNN), which comprises the following steps: RDCNN offline training — for a given set of radar echo images, a training sample set is obtained through data preprocessing; an RDCNN model is initialized; and the RDCNN is trained with the training sample set, the network being brought to convergence by alternating forward propagation to compute output values and backpropagation to update the network parameters.
Description
Technical field
The invention belongs to the technical field of surface weather observation in atmospheric sounding, and more particularly to a radar echo extrapolation model training method based on a cyclic dynamic convolutional neural network.
Background technique
Nowcasting mainly refers to weather forecasting at high spatial and temporal resolution over the next 0–3 hours; its main prediction targets include disastrous weather such as heavy precipitation, strong winds, and hail. At present, many forecasting systems use numerical prediction models, but because numerical forecasts suffer from slow spin-up, their nowcasting ability is limited. New-generation Doppler weather radar has very high sensitivity and resolution: the spatial resolution of its data can reach 200–1000 m, and the temporal resolution can reach 2–15 min. In addition, Doppler radar offers a well-designed operating mode, comprehensive condition monitoring and fault warning, an advanced real-time calibration system, and a rich set of radar meteorology products, which can greatly improve the reliability of nowcasting. New-generation Doppler weather radar has therefore become one of the most effective tools for nowcasting. Nowcasting with Doppler radar is based primarily on radar echo extrapolation, i.e., inferring the future position and intensity of radar echoes from the current radar observations, thereby predicting the track of strong convective systems.
Traditional radar echo extrapolation methods are the centroid tracking method and the cross-correlation method based on the maximum correlation coefficient (Tracking Radar Echoes by Correlation, TREC), but both have certain shortcomings. Centroid tracking is only applicable to relatively strong storm cells of small extent and is unreliable for forecasting large-scale precipitation. TREC generally treats the echo as changing linearly, whereas real echo evolution is far more complex, and the method is also susceptible to disordered vector disturbances in the motion vector field. In addition, existing methods make poor use of radar data, while historical radar data contain important features of local weather system evolution and therefore have high research value.
To improve the lead time of radar echo extrapolation and to learn the laws of radar echo evolution from large amounts of historical radar data, machine learning methods are introduced into radar echo extrapolation. Convolutional neural networks (CNNs), an important branch of deep learning, are widely used in fields such as image processing and pattern recognition. Their most distinctive features are local connectivity, weight sharing, and down-sampling, which give them strong robustness to deformation, translation, and flipping of the input image. To exploit the strong temporal correlation between radar echo images, a cyclic dynamic convolutional neural network driven by the input is designed; the network can dynamically change its weight parameters according to the input radar echo maps and thereby predict the extrapolated image. Training the cyclic dynamic convolutional neural network on historical radar data allows the network to extract echo features more fully and to learn the laws of echo evolution, which is of great significance for improving radar echo extrapolation accuracy and optimizing the nowcasting effect.
Summary of the invention
Purpose of the invention: the technical problem to be solved by the present invention is that existing radar echo extrapolation methods have a short extrapolation lead time and make insufficient use of radar data. A radar echo extrapolation method based on a cyclic dynamic convolutional neural network (RDCNN) is proposed, realizing the extrapolation forecast of constant altitude plan position indicator (CAPPI) images of radar echo intensity. The method comprises the following steps:
Step 1, data preprocessing: input the training image set, standardize every image in the set, and convert each image into a 280 × 280 grayscale image to obtain a grayscale image set; partition the grayscale image set and construct a training sample set containing TrainsetSize groups of samples;
Step 2, initialize the RDCNN: design the RDCNN structure, composed of a cyclic dynamic sub-network RDSN that generates the probability vectors and a probability prediction layer PPL that predicts the radar echo at the next time instant, providing the initialization model of the RDCNN for the offline training stage;
Step 3, initialize the training parameters of the RDCNN: set the network learning rate λ = 0.0001, the number of samples input per training pass BatchSize = 10, the maximum number of batch training passes over the training sample set BatchMax = ⌊TrainsetSize / BatchSize⌋, the current batch training pass BatchNum = 1, the maximum number of iterations of network training IterationMax = 40, and the current iteration number IterationNum = 1;
Step 4, read training samples: using batch training, read BatchSize groups of training samples from the training sample set obtained in step 1 for each training pass; each group of training samples contains five images {x1, x2, x3, x4, y}, where {x1, x2, x3, x4} is the input image sequence and y is the corresponding control label;
Step 5, forward propagation: extract the features of the input image sequence in the RDSN to obtain the horizontal probability vector HPV and the vertical probability vector VPV; in the probability prediction layer, convolve the last image of the input image sequence successively with VPV and then HPV to obtain the output forecast image of forward propagation;
Step 6, backpropagation: compute the error terms of the probability vectors in the PPL; then, from the probability-vector error terms, compute the error term of every network layer in the RDSN layer by layer from back to front; then compute the gradients of each network layer's error term with respect to the weight parameters and bias parameters, and use the resulting gradients to update the network parameters;
Step 7, offline training stage control: overall control of the offline neural network training stage is divided into the following three cases:
If unused training samples remain in the training sample set, i.e. BatchNum < BatchMax, return to step 4, continue to read BatchSize groups of training samples, and carry out network training;
If no unused training samples remain in the training sample set, i.e. BatchNum = BatchMax, and the current number of network iterations is less than the maximum number of iterations, i.e. IterationNum < IterationMax, then set BatchNum = 1, return to step 4, continue to read BatchSize groups of training samples, and carry out network training;
If no unused training samples remain in the training sample set, i.e. BatchNum = BatchMax, and the number of network iterations has reached the maximum number of iterations, i.e. IterationNum = IterationMax, terminate the RDCNN offline training stage and obtain the converged RDCNN model.
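The three-case control logic of step 7 can be sketched as a simple loop. The function and variable names below are illustrative assumptions, not the patent's code; the real steps 4–6 (read a batch, forward-propagate, backpropagate) would run where the schedule entry is recorded.

```python
def offline_training(trainset_size, batch_size=10, iteration_max=40):
    """Sketch of the step-7 control flow over batches and iterations."""
    batch_max = trainset_size // batch_size
    schedule = []  # record the (iteration, batch) pairs actually trained
    iteration_num, batch_num = 1, 1
    while True:
        schedule.append((iteration_num, batch_num))  # steps 4-6 would run here
        if batch_num < batch_max:                    # case 1: samples remain
            batch_num += 1
        elif iteration_num < iteration_max:          # case 2: start next iteration
            batch_num = 1
            iteration_num += 1
        else:                                        # case 3: training terminates
            return schedule
```

With TrainsetSize = 50 and BatchSize = 10, each iteration trains 5 batches, and the loop ends only when both counters reach their maxima.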
Step 1 comprises the following steps:
Step 1-1, sampling: input the training image set; the images in the set are arranged in chronological order at equal time intervals of 6 minutes and number NTrain in total. TrainsetSize is determined by the following formula:
TrainsetSize = (NTrain − 1 − Mod(NTrain − 1, 4)) / 4 = ⌊(NTrain − 1) / 4⌋,
where Mod(a, 4) denotes the remainder of a modulo 4 and ⌊x⌋ denotes the largest integer not greater than x. After TrainsetSize is obtained, the first 4 × TrainsetSize + 1 images of the training image set are retained; the sampling meets this image count requirement by deleting the last images of the training image set;
Step 1-2, normalize images: apply image transformation and normalization operations to the sampled images, converting color images with an original resolution of 2000 × 2000 into grayscale images with a resolution of 280 × 280;
Step 1-3, construct the training sample set: the training sample set is constructed from the grayscale images obtained in step 1-2. Every four adjacent images in the grayscale image set, i.e. the {4N+1, 4N+2, 4N+3, 4N+4}-th images, serve as one input sequence, and the [4 × (N+1) + 1]-th image is cropped so that its central 240 × 240 portion is retained as the corresponding control label. The N-th group of samples Sample_N is constructed as follows:
Sample_N = ( {G_{4N+1}, G_{4N+2}, G_{4N+3}, G_{4N+4}}, Crop(G_{4(N+1)+1}) ),
where G_{4N+1} denotes the (4N+1)-th image of the grayscale image set, N is an integer with N ∈ [0, TrainsetSize − 1], and Crop() denotes the cropping operation that retains the central 240 × 240 portion of the original image. This finally yields the training sample set containing TrainsetSize groups of training samples.
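A minimal sketch of the grouping of steps 1-1 and 1-3, assuming the images are already 280 × 280 grayscale arrays; the function name and the centered Crop() offsets are illustrative assumptions.

```python
import numpy as np

def build_training_set(gray_images):
    """Group every four adjacent 280x280 grayscale images as an input
    sequence and crop the central 240x240 of the following image as the
    control label (steps 1-1 and 1-3); a sketch, not the patented code."""
    n_train = len(gray_images)
    trainset_size = (n_train - 1) // 4          # keep 4*TrainsetSize+1 images
    samples = []
    for n in range(trainset_size):
        seq = gray_images[4 * n:4 * n + 4]      # images 4N+1 .. 4N+4 (1-based)
        label_full = gray_images[4 * (n + 1)]   # image 4(N+1)+1 (1-based)
        label = label_full[20:260, 20:260]      # Crop(): central 240x240
        samples.append((seq, label))
    return samples
```

For example, 9 input images yield TrainsetSize = 2 groups, and any leftover trailing images are simply never indexed, matching the deletion rule of step 1-1.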
Step 1-2 comprises the following steps:
Step 1-2-1, image conversion: convert the images sampled in step 1-1 into grayscale images, crop them to retain the part whose central resolution is 560 × 560 in the original image, and compress the cropped images to a resolution of 280 × 280, yielding grayscale images with a resolution of 280 × 280;
Step 1-2-2, data normalization: map the value of each pixel of the grayscale images obtained in step 1-2-1 from [0, 255] to [0, 1].
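The conversion of steps 1-2-1 and 1-2-2 can be sketched as follows. The grayscale conversion by channel averaging and the 2 × 2 block-mean compression from 560 × 560 to 280 × 280 are assumptions, since the patent does not specify the grayscale or resampling method.

```python
import numpy as np

def preprocess(color_image):
    """Steps 1-2-1 / 1-2-2 sketch: grayscale, central 560x560 crop,
    2x2 block-mean downsample to 280x280, then map [0,255] -> [0,1]."""
    gray = color_image.mean(axis=2)                         # simple grayscale
    h, w = gray.shape
    top, left = (h - 560) // 2, (w - 560) // 2
    crop = gray[top:top + 560, left:left + 560]             # central 560x560
    small = crop.reshape(280, 2, 280, 2).mean(axis=(1, 3))  # 560 -> 280
    return small / 255.0                                    # normalize
```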
Step 2 comprises the following steps:
Step 2-1, construct the cyclic dynamic sub-network RDSN: the sub-network consists of 15 network layers, from front to back: convolutional layer C1, down-sampling layer S1, hidden layer H1, convolutional layer C2, down-sampling layer S2, hidden layer H2, convolutional layer C3, down-sampling layer S3, hidden layer H3, convolutional layer C4, down-sampling layer S4, hidden layer H4, convolutional layer C5, hidden layer H5, and classifier layer F1;
Step 2-2, construct the probability prediction layer PPL: dynamic convolutional layers DC1 and DC2 are constructed in the probability prediction layer; the vertical probability vector VPV output by the RDSN serves as the convolution kernel of dynamic convolutional layer DC1, and the horizontal probability vector HPV serves as the convolution kernel of dynamic convolutional layer DC2.
Step 2-1 comprises the following steps:
Step 2-1-1, construct the convolutional layers: for each convolutional layer lC, lC ∈ {C1, C2, C3, C4, C5}, determine the following: the number of output feature maps OutputMaps_lC of the convolutional layer, its convolution kernels k_lC, and its bias parameters bias_lC. For the convolution kernels, the kernel width KernelSize_lC and the number of kernels KernelNumber_lC must be determined; the latter equals the product of the numbers of input and output feature maps of the convolutional layer, and the kernels are constructed according to the Xavier initialization method. The number of bias parameters equals the number of output feature maps of the layer. The output feature map width of layer lC is OutputSize_lC, jointly determined by the input feature map width of layer lC and the kernel width KernelSize_lC, i.e. OutputSize_lC = InputSize_lC − KernelSize_lC + 1, where InputSize_lC denotes the output feature map width of the layer preceding convolutional layer lC;
For convolutional layer C1, set the number of output feature maps OutputMapsC1 = 12, the output feature map width OutputSizeC1 = 272, the kernel width KernelSizeC1 = 9, the bias parameter biasC1 initialized to zero, and the number of kernels KernelNumberC1 = 48; in accordance with Xavier initialization, the initial value of each parameter in a convolution kernel is (Rand() − 0.5) · 2 · sqrt(6 / (nin + nout)), where nin and nout are the numbers of input and output feature maps of the layer and Rand() generates a uniform random number in [0, 1];
For convolutional layer C2, set OutputMapsC2 = 32, OutputSizeC2 = 128, KernelSizeC2 = 9, the C2-layer biases initialized to zero, and KernelNumberC2 = 384; each kernel parameter is initialized by the same Xavier rule;
For convolutional layer C3, set OutputMapsC3 = 32, OutputSizeC3 = 56, KernelSizeC3 = 9, the C3-layer biases initialized to zero, and KernelNumberC3 = 1024; each kernel parameter is initialized by the same Xavier rule;
For convolutional layer C4, set OutputMapsC4 = 32, OutputSizeC4 = 20, KernelSizeC4 = 9, the C4-layer biases initialized to zero, and KernelNumberC4 = 1024; each kernel parameter is initialized by the same Xavier rule;
For convolutional layer C5, set OutputMapsC5 = 32, OutputSizeC5 = 4, KernelSizeC5 = 7, the C5-layer biases initialized to zero, and KernelNumberC5 = 1024; each kernel parameter is initialized by the same Xavier rule;
Step 2-1-2, construct the hidden layers: for each hidden layer lH, lH ∈ {H1, H2, H3, H4, H5}, determine the following: the number of output feature maps OutputMaps_lH of the hidden layer, its convolution kernels k_lH, and its bias parameters bias_lH. For the convolution kernels, the kernel width KernelSize_lH and the number of kernels KernelNumber_lH must be determined; the latter equals the product of the numbers of input and output feature maps of the hidden layer, and the kernels are constructed according to the Xavier initialization method. The number of bias parameters equals the number of output feature maps of the hidden layer. The output feature map width of layer lH is OutputSize_lH, which is identical to the input feature map width of the corresponding convolutional layer;
For hidden layer H1, set OutputMapsH1 = 4, OutputSizeH1 = 280, KernelSizeH1 = 9, the bias parameter biasH1 initialized to zero, and KernelNumberH1 = 48; the initial value of each kernel parameter follows the same Xavier rule as in step 2-1-1, with Rand() generating a uniform random number;
For hidden layer H2, set OutputMapsH2 = 8, OutputSizeH2 = 136, KernelSizeH2 = 9, the H2-layer biases initialized to zero, and KernelNumberH2 = 256; each kernel parameter is initialized by the same Xavier rule;
For hidden layer H3, set OutputMapsH3 = 8, OutputSizeH3 = 64, KernelSizeH3 = 9, the H3-layer biases initialized to zero, and KernelNumberH3 = 256; each kernel parameter is initialized by the same Xavier rule;
For hidden layer H4, set OutputMapsH4 = 8, OutputSizeH4 = 28, KernelSizeH4 = 9, the H4-layer biases initialized to zero, and KernelNumberH4 = 256; each kernel parameter is initialized by the same Xavier rule;
For hidden layer H5, set OutputMapsH5 = 8 and OutputSizeH5 = 10; the H5-layer biases are initialized to zero, and the H5 layer contains 256 weight parameters kH5, each initialized by the same Xavier rule.
Step 2-1-3, construct the down-sampling layers: the down-sampling layers contain no parameters that require training; the sampling kernels of down-sampling layers S1, S2, S3, and S4 are initialized as the 2 × 2 mean kernel whose four elements all equal 1/4. For a down-sampling layer lS, lS ∈ {S1, S2, S3, S4}, the number of output feature maps OutputMaps_lS is identical to that of the convolutional layer immediately above it, and the output feature map width OutputSize_lS is half of the output feature map width of that convolutional layer, expressed by the formula OutputSize_lS = OutputSize_{lS−1} / 2;
Step 2-1-4, construct the classifier layer: the classifier layer consists of one fully connected layer F1. The weight parameters of the F1 layer are the horizontal weight parameter matrix WH and the vertical weight parameter matrix WV, both of size 41 × 512; each parameter in the weight matrices is initialized according to the Xavier rule of step 2-1-1. The bias parameters are the horizontal bias parameter BH and the vertical bias parameter BV, each initialized as a 41 × 1 one-dimensional zero vector.
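The widths fixed in steps 2-1-1 through 2-1-3 are mutually consistent: each convolutional layer maps width w to w − KernelSize + 1, and each down-sampling layer halves it. A quick check of the chain 280 → 272 → 136 → 128 → 64 → 56 → 28 → 20 → 10 → 4:

```python
def rdsn_widths():
    """Verify the feature-map widths of the RDSN: a convolutional layer
    maps width w -> w - k + 1 and a down-sampling layer halves it."""
    conv = lambda w, k: w - k + 1
    pool = lambda w: w // 2
    w = 280                       # input (and H1 output) width
    widths = {}
    for name, k in [("C1", 9), ("C2", 9), ("C3", 9), ("C4", 9)]:
        w = conv(w, k); widths[name] = w
        w = pool(w); widths["S" + name[1]] = w
    widths["C5"] = conv(w, 7)     # C5 uses a width-7 kernel
    return widths
```

Running this reproduces exactly the OutputSize values listed above for C1–C5 and S1–S4.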
Step 5 comprises the following steps:
Step 5-1, the RDSN computes the probability vectors: the image sequence features of the input are extracted in the sub-network by the alternating processing of convolutional layers and down-sampling layers and are processed in the classifier layer by the Softmax function, yielding the horizontal probability vector HPV and the vertical probability vector VPV;
Step 5-2, the PPL outputs the forecast image: with the HPV and VPV obtained in step 5-1 as the convolution kernels of the probability prediction layer, the last image of the input image sequence is convolved successively with VPV and then HPV, giving the output forecast image of forward propagation.
Step 5-1 comprises the following steps:
Step 5-1-1, determine the network layer type: let l denote the current layer in the RDSN; l takes the values {H1, C1, S1, H2, C2, S2, H3, C3, S3, H4, C4, S4, H5, C5, F1} in turn, with initial value H1. Judge the type of layer l: if l ∈ {H1, H2, H3, H4, H5}, l is a hidden layer, execute step 5-1-2; if l ∈ {C1, C2, C3, C4, C5}, l is a convolutional layer, execute step 5-1-3; if l ∈ {S1, S2, S3, S4}, l is a down-sampling layer, execute step 5-1-4; if l = F1, l is the classifier layer, execute step 5-1-5. During training, the output feature maps that a convolutional layer produced in the previous training pass are denoted aC′, where C ∈ {C1, C2, C3, C4, C5}; the initial value of aC′ is the zero matrix;
Step 5-1-2, process a hidden layer: here l = lH, lH ∈ {H1, H2, H3, H4, H5}, and two cases arise:
When lH ∈ {H1, H2, H3, H4}: first compute the j-th output feature map a_lH^j of layer lH. If lH = H1, then C = C1. Expand the width of the corresponding feature map in aC′ to ExpandSize_lH by zero-pixel padding, convolve it with the corresponding kernels of this layer, sum the convolution results, add the j-th bias parameter bias_lH^j of layer lH, and process the sum with the ReLU activation function to obtain a_lH^j. The calculation formula is as follows:
a_lH^j = ReLU( Σ_{i=1..nh} Expand_Zero(a_C^i′) * k_lH^{ij} + bias_lH^j ),
where Expand_Zero() denotes the zero-padding expansion function, k_lH^{ij} is the kernel connecting the i-th input feature map and the j-th output feature map of layer lH, bias_lH^j is the j-th bias of layer lH, nh is the number of input feature maps of the current hidden layer, a_C^i′ denotes the i-th input feature map of layer lH, and ExpandSize_lH is determined by the input feature map width and the kernel size, with ExpandSize_lH = InputSize_lH + 2 · (KernelSize_lH − 1);
When lH = H5: first compute the j-th output feature map a_H5^j of the H5 layer. Expand the feature map resolution of aC5′ to 10 × 10 by zero-pixel padding, multiply it element-wise by the corresponding weight parameters of this layer, sum the results, add the j-th bias parameter bias_H5^j of the H5 layer, and process with the ReLU activation function to obtain a_H5^j. The calculation formula is as follows:
a_H5^j = ReLU( Σ_{i=1..nh} Expand_Zero(a_C5^i′) ⊙ k_H5^{ij} + bias_H5^j ),
where k_H5^{ij} is the weight parameter of the i-th input feature map of the H5 layer corresponding to its j-th output feature map and ⊙ denotes element-wise multiplication;
Compute all output feature maps of layer lH in turn to obtain the output feature maps a_lH of the layer, update l to l + 1, and return to step 5-1-1 to judge the network type and carry out the operation of the next network layer;
Step 5-1-3, process a convolutional layer: here l = lC, lC ∈ {C1, C2, C3, C4, C5}. First compute the j-th output feature map a_lC^j of layer lC: convolve each input feature map of layer lC with the corresponding kernel of this layer, sum the convolution results, add the j-th bias parameter bias_lC^j of layer lC, and process with the ReLU activation function to obtain a_lC^j. The calculation formula is as follows:
a_lC^j = ReLU( Σ_{i=1..nc} a_{lC−1}^i * k_lC^{ij} + bias_lC^j ),
where k_lC^{ij} is the kernel connecting the i-th input feature map and the j-th output feature map of layer lC, nc is the number of input feature maps of the convolutional layer, a_{lC−1}^i denotes the i-th input feature map of layer lC, which is also the i-th output feature map of layer lC − 1, and * denotes matrix convolution; if lC = C1, layer lC − 1 is the input layer.
Compute all output feature maps of layer lC in turn to obtain the output feature maps a_lC of the layer, and update aC′ with the value of a_lC (lC = C; for example, when lC = C1, aC1′ is updated with aC1). Update l to l + 1 and return to step 5-1-1 to judge the network type and carry out the operation of the next network layer;
Step 5-1-4, process a down-sampling layer: here l = lS, lS ∈ {S1, S2, S3, S4}. Convolve each output feature map of the preceding convolutional layer with the 2 × 2 sampling kernel of step 2-1-3 and sample with stride 2 to obtain the output feature maps of layer lS. The calculation formula is as follows:
a_lS^j = Sample( a_{lS−1}^j * kernel_lS ),
where Sample() denotes sampling with stride 2, lS − 1 denotes the convolutional layer immediately preceding the current down-sampling layer, and a_lS^j denotes the j-th output feature map of layer lS. After the output feature maps a_lS of layer lS are obtained, update l to l + 1 and return to step 5-1-1 to judge the network type and carry out the operation of the next network layer;
Step 5-1-5, compute the F1-layer probability vectors: here l = F1. By matrix transformation, unfold the 32 output feature maps of C5, each of resolution 4 × 4, in column order to obtain the output feature vector aF1 of the F1 layer, of resolution 512 × 1. Compute the matrix products of the horizontal weight parameter matrix WH with aF1 and of the vertical weight parameter matrix WV with aF1, add the horizontal bias parameter BH and the vertical bias parameter BV to the respective results, and process each sum with the Softmax function to obtain the horizontal probability vector HPV and the vertical probability vector VPV. The specific calculation formulas are as follows:
HPV = Softmax(WH · aF1 + BH),
VPV = Softmax(WV · aF1 + BV).
The vertical probability vector VPV is then transposed to obtain the final vertical probability vector.
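Step 5-1-5 amounts to two affine maps followed by Softmax. The sketch below assumes NumPy arrays of the stated sizes (WH, WV: 41 × 512; BH, BV: 41 × 1) and returns HPV as a 1 × 41 row kernel and the final (transposed) VPV as a 41 × 1 column kernel; the numerically stable Softmax form is an implementation choice.

```python
import numpy as np

def classifier_probability_vectors(a_c5, WH, WV, BH, BV):
    """Step 5-1-5 sketch: unfold C5's 32 feature maps of size 4x4 in
    column order into the 512x1 vector aF1, then Softmax the two
    affine transforms to obtain the probability-vector kernels."""
    a_f1 = a_c5.reshape(512, 1, order="F")        # column-order unfolding

    def softmax(z):
        e = np.exp(z - z.max())                   # numerically stable
        return e / e.sum()

    hpv = softmax(WH @ a_f1 + BH).reshape(1, 41)  # horizontal kernel
    vpv = softmax(WV @ a_f1 + BV).reshape(41, 1)  # vertical kernel
    return hpv, vpv
```

Because of the Softmax, each vector's 41 entries are non-negative and sum to 1, which is what makes the dynamic convolution of step 5-2 a probability-weighted displacement of the echo image.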
Step 5-2 comprises the following steps:
Step 5-2-1, DC1-layer prediction in the vertical direction: convolve the last input image of the input layer with the vertical probability vector VPV to obtain the DC1-layer output feature map aDC1, of resolution 240 × 280;
Step 5-2-2, DC2-layer prediction in the horizontal direction: convolve the DC1-layer output feature map aDC1 with the horizontal probability vector HPV to obtain the output forecast image of forward propagation, of resolution 240 × 240.
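Steps 5-2-1 and 5-2-2 are two "valid" one-dimensional convolutions with the probability vectors; since each vector sums to 1, every forecast pixel is a probability-weighted average over a 41-pixel neighbourhood. The sliding-window (correlation) form below is an assumption, as the patent does not state whether the dynamic kernels are flipped.

```python
import numpy as np

def ppl_forecast(last_image, vpv, hpv):
    """Steps 5-2-1 / 5-2-2 sketch: 'valid' convolution of the last
    280x280 input image with the 41x1 vertical and then the 1x41
    horizontal probability vector, yielding a 240x240 forecast."""
    v = vpv.ravel()
    h = hpv.ravel()
    # vertical pass (DC1): 280x280 -> 240x280
    a_dc1 = np.array([[v @ last_image[r:r + 41, c]
                       for c in range(last_image.shape[1])]
                      for r in range(last_image.shape[0] - 40)])
    # horizontal pass (DC2): 240x280 -> 240x240
    forecast = np.array([[a_dc1[r, c:c + 41] @ h
                          for c in range(a_dc1.shape[1] - 40)]
                         for r in range(a_dc1.shape[0])])
    return forecast
```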
Step 6 comprises the following steps:
Step 6-1, compute the PPL error terms: take the difference between the forecast image obtained in step 5-2-2 and the control label of the input training sample, compute the error terms of the DC2 and DC1 layers, and finally obtain the error term δHPV of the horizontal probability vector and the error term δVPV of the vertical probability vector;
Step 6-2, compute the RDSN error terms: from the error term δHPV of the horizontal probability vector and the error term δVPV of the vertical probability vector, compute in turn, from back to front, the error terms of the classifier layer F1, the convolutional layers (C5, C4, C3, C2, C1), the hidden layers (H5, H4, H3, H2, H1), and the down-sampling layers (S4, S3, S2, S1); the resolution of any layer's error term matrix is identical to the resolution of that layer's output feature maps;
Step 6-3, compute the gradients: from the error terms obtained in step 6-2, compute the gradient of each network layer's error term with respect to the layer's weight parameters and bias parameters;
Step 6-4, update the parameters: multiply the gradients of each network layer's weight and bias parameters obtained in step 6-3 by the learning rate of the RDCNN to obtain the update terms of each layer's weight and bias parameters; subtract the respective update terms from the original weight and bias parameters to obtain the updated weight and bias parameters.
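Step 6-4 is plain gradient descent. A minimal sketch, with hypothetical parameter and gradient dictionaries standing in for the network's weight and bias tensors:

```python
import numpy as np

def sgd_update(params, grads, learning_rate=0.0001):
    """Step 6-4 sketch: new_param = param - learning_rate * gradient,
    applied uniformly to every weight and bias of the network."""
    return {name: p - learning_rate * grads[name]
            for name, p in params.items()}
```

The default learning rate 0.0001 matches λ as initialized in step 3.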
Step 6-1 comprises the following steps:
Step 6-1-1, compute the dynamic convolutional layer DC2 error term: take the difference between the forecast image obtained in step 5-2-2 and the control label of this group of samples to obtain the error term matrix δDC2, of size 240 × 240;
Step 6-1-2, compute the dynamic convolutional layer DC1 error term: expand the DC2-layer error term matrix δDC2 to 240 × 320 by zero padding, rotate the horizontal probability vector by 180 degrees, and convolve the expanded error term matrix with the rotated horizontal probability vector to obtain the DC1-layer error term matrix δDC1, of size 240 × 280. The formula is as follows:
δDC1 = Expand_Zero(δDC2) * rot180(HPV),
where rot180() denotes the rotation function with an angle of 180°. As an illustration of zero expansion, a 2 × 2 matrix is extended to a 4 × 4 matrix in which the region of central resolution 2 × 2 agrees with the original matrix and the remaining positions are filled with zero pixels;
Step 6-1-3, compute the probability vector error terms: to compute the error term of the horizontal probability vector HPV, convolve the DC1-layer output feature map with the error term matrix δDC2; the convolution yields a 1 × 41 row vector, which is the error term δHPV of HPV. The formula is as follows:
δHPV = aDC1 * δDC2.
To compute the error term of the vertical probability vector VPV, convolve the input feature map of the input layer with the error term matrix δDC1; the convolution yields a 41 × 1 column vector, which is the error term δVPV of VPV. The formula is as follows:
δVPV = x4 * δDC1,
where x4 is the last image of the input image sequence of the training sample;
Step 6-2 comprises the following steps:
Step 6-2-1, compute the classifier layer F1 error term: multiply the probability vector error terms δVPV and δHPV obtained in step 6-1-3 by the F1-layer vertical weight parameter matrix WV and horizontal weight parameter matrix WH respectively, then sum the two matrix products and take their average to obtain the F1-layer error term δF1. The formula is as follows:
δF1 = ( (WV)^T × δVPV + (WH)^T × (δHPV)^T ) / 2,
where × denotes matrix multiplication and (·)^T denotes the transpose of a matrix; the resulting δF1 has size 512 × 1.
Step 6-2-2, compute the convolutional layer C5 error term: by matrix transformation, transform the F1-layer error term δF1 obtained in step 6-2-1 into 32 matrices of resolution 4 × 4, {δC5^1, ..., δC5^32}, which together form the C5-layer error term δC5; δC5^32 denotes the 32nd transformed 4 × 4 matrix;
Step 6-2-3, determine the network layer type: let l denote the current layer of the RDSN; l takes the values {H5, S4, C4, H4, S3, C3, H3, S2, C2, H2, S1, C1, H1} in turn, with initial value H5. Judge the type of layer l: if l ∈ {H5, H4, H3, H2, H1}, l is a hidden layer, execute step 6-2-4; if l ∈ {S4, S3, S2, S1}, l is a down-sampling layer, execute step 6-2-5; if l ∈ {C4, C3, C2, C1}, l is a convolutional layer, execute step 6-2-6;
Step 6-2-4, calculate the hidden layer error term: at this time l = l_H, l_H ∈ {H5, H4, H3, H2, H1}. To calculate the i-th error term matrix δ_lH^i of layer l_H, each error term matrix δ_{l+1} of the convolutional layer l+1 is first expanded by zero padding to width ExpandSize_{l+1}:
ExpandSize_{l+1} = OutputSize_{l+1} + 2·(KernelSize_{l+1} − 1);
the corresponding convolution kernels are then rotated by 180 degrees, the expanded matrices are convolved with the rotated kernels, and the convolution results are summed to obtain the i-th error term matrix δ_lH^i of layer l_H. The formula is as follows:
δ_lH^i = Σ_{j=1..nc} Expand_Zero(δ_{l+1}^j) * rot180(k_{l+1}^{ij}),
where nc denotes the number of error terms of convolutional layer l+1, which equals the number of output feature maps of layer l+1, i.e. nc = OutputMaps_{l+1}.
All error term matrices are calculated in turn to obtain the error term δ_lH of layer l_H; l is updated to l−1, and the procedure returns to step 6-2-3 to judge the network type and calculate the error term of the previous network layer;
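The zero-padding plus 180-degree kernel rotation of step 6-2-4 amounts to a "full" convolution of the next layer's error terms; a toy sketch under assumed small shapes (helper names and sizes are ours):

```python
import numpy as np

def valid_correlate(a, b):
    rows, cols = a.shape[0] - b.shape[0] + 1, a.shape[1] - b.shape[1] + 1
    return np.array([[np.sum(a[i:i + b.shape[0], j:j + b.shape[1]] * b)
                      for j in range(cols)] for i in range(rows)])

def hidden_error(deltas_next, kernels_next):
    """delta_lH^i = sum_j Expand_Zero(delta_{l+1}^j) * rot180(k^{ij}) (step 6-2-4)."""
    acc = None
    for d, k in zip(deltas_next, kernels_next):
        pad = k.shape[0] - 1                 # so ExpandSize = OutputSize + 2*(KernelSize-1)
        d_exp = np.pad(d, pad)               # zero expansion
        term = valid_correlate(d_exp, np.rot90(k, 2))  # 180-degree rotated kernel
        acc = term if acc is None else acc + term
    return acc

# toy shapes: next-layer error 4x4, kernel 3x3 -> error matrix of width 4 + (3-1) = 6
deltas = [np.ones((4, 4)), np.ones((4, 4))]
kernels = [np.ones((3, 3)), np.ones((3, 3))]
err = hidden_error(deltas, kernels)
```

The result has the width of the layer's own feature maps, matching the statement in step 1-6-2 that each error term matrix has the resolution of that layer's output.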
Step 6-2-5, calculate the down-sampling layer error term: at this time l = l_S, l_S ∈ {S4, S3, S2, S1}. To calculate the i-th error term matrix δ_lS^i of layer l_S, each error term matrix δ_{l+2} of the convolutional layer l+2 is first expanded by zero padding to width ExpandSize_{l+2}:
ExpandSize_{l+2} = OutputSize_{l+2} + 2·(KernelSize_{l+2} − 1);
the corresponding convolution kernels are then rotated by 180 degrees, the expanded matrices are convolved with the rotated kernels, and the convolution results are summed to obtain the i-th error term matrix δ_lS^i of layer l_S. The formula is as follows:
δ_lS^i = Σ_{j=1..nc} Expand_Zero(δ_{l+2}^j) * rot180(k_{l+2}^{ij}),
where nc denotes the number of error terms of convolutional layer l+2, which equals the number of output feature maps of layer l+2, i.e. nc = OutputMaps_{l+2}.
All error term matrices are calculated in turn to obtain the error term δ_lS of layer l_S; l is updated to l−1, and the procedure returns to step 6-2-3 to judge the network type and calculate the error term of the previous network layer;
Step 6-2-6, calculate the convolutional layer error term: at this time l = l_C, l_C ∈ {C4, C3, C2, C1} (since the initial value of l in step 6-2-3 is H5, the case l_C = C5 does not occur). For the i-th error term matrix δ_lC^i of layer l_C, the corresponding i-th error term matrix δ_{l+1}^i of the down-sampling layer l+1 is first up-sampled; during up-sampling, each element of δ_{l+1}^i is distributed evenly over its sampling region, giving an up-sampling matrix of resolution OutputSize_lC × OutputSize_lC. The derivative of the activation function at the corresponding feature map of layer l_C is then multiplied element-wise with the up-sampling matrix to obtain the i-th error term matrix δ_lC^i of layer l_C. The formula is as follows:
δ_lC^i = ReLU′(a_lC^i) ∘ UpSample(δ_{l+1}^i),
where ∘ denotes the element-wise (Hadamard) matrix product and ReLU′(·) denotes the derivative of the ReLU activation function, i.e. ReLU′(x) = 1 if x > 0 and 0 otherwise. UpSample(·) denotes the up-sampling function: each pixel of the original image corresponds to one up-sampling region, and the original pixel value is distributed evenly over every pixel of that region. All error term matrices are calculated in turn to obtain the error term δ_lC of layer l_C;
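The mean-allocation up-sampling and ReLU derivative of step 6-2-6 (Fig. 6 illustrates the up-sampling of a 2×2 matrix) can be sketched as follows; np.kron spreads each error value evenly over its 2×2 sampling region, and the toy data are ours:

```python
import numpy as np

def upsample(delta, s=2):
    """Distribute each error value evenly over its s x s sampling region."""
    return np.kron(delta, np.ones((s, s)) / (s * s))

def relu_grad(a):
    """ReLU'(x): 1 where the activation is positive, 0 elsewhere."""
    return (a > 0).astype(float)

delta_S = np.array([[4.0, 8.0],
                    [0.0, -4.0]])        # 2x2 down-sampling-layer error term
a_C = np.array([[1.0, -1.0, 2.0, 3.0],
                [0.5, 2.0, -2.0, 1.0],
                [1.0, 1.0, 1.0, 1.0],
                [-1.0, 1.0, 1.0, 1.0]])  # 4x4 conv-layer output feature map
delta_C = relu_grad(a_C) * upsample(delta_S)  # element-wise (Hadamard) product
```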
Step 6-2-7: at this point layer l is a convolutional layer, i.e. l = l_C, and two cases are distinguished:
if l ≠ C1, l is updated to l−1, and the procedure returns to step 6-2-3 to judge the network type and calculate the error term of the previous network layer;
if l = C1, the sub-network error term calculation of step 6-2 ends;
Step 6-3 includes the following steps:
Step 6-3-1, calculate the gradient of each convolutional layer error term with respect to the convolution kernels: let l_C denote the convolutional layer currently being processed, l_C ∈ {C1, C2, C3, C4, C5}. Starting from layer C1, the gradient of each convolutional layer error term with respect to its convolution kernels is calculated in turn: the i-th input feature map a_{lC−1}^i of the convolutional layer is convolved with the j-th error term matrix δ_lC^j of layer l_C, and the convolution result is the gradient value ∇k_lC^{ij} of the corresponding convolution kernel. The formula is as follows:
∇k_lC^{ij} = a_{lC−1}^i * δ_lC^j,
where j ∈ [1, OutputMaps_lC] and i ∈ [1, OutputMaps_{lC−1}], OutputMaps_lC and OutputMaps_{lC−1} respectively denoting the number of output feature maps of layer l_C and of layer l_C − 1;
Step 6-3-2, calculate the gradient of each convolutional layer error term with respect to the biases: let l_C denote the convolutional layer currently being processed, l_C ∈ {C1, C2, C3, C4, C5}. Starting from layer C1, the gradient of each convolutional layer error term with respect to its biases is calculated in turn: all elements of the j-th error term matrix δ_lC^j of layer l_C are summed, giving the gradient value ∇b_lC^j of the j-th bias of this layer. The formula is as follows:
∇b_lC^j = Sum(δ_lC^j),
where Sum(·) denotes summation over all elements of a matrix;
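Steps 6-3-1 and 6-3-2 reduce to one valid correlation and one element sum per map; a toy numpy sketch (sizes are illustrative only):

```python
import numpy as np

def valid_correlate(a, b):
    rows, cols = a.shape[0] - b.shape[0] + 1, a.shape[1] - b.shape[1] + 1
    return np.array([[np.sum(a[i:i + b.shape[0], j:j + b.shape[1]] * b)
                      for j in range(cols)] for i in range(rows)])

a_in = np.arange(16.0).reshape(4, 4)   # i-th input feature map of the conv layer
delta = np.ones((2, 2))                # j-th error term matrix of the conv layer

grad_k = valid_correlate(a_in, delta)  # gradient w.r.t. the kernel k^{ij} (step 6-3-1)
grad_b = np.sum(delta)                 # gradient w.r.t. the j-th bias (step 6-3-2)
```

Note that the gradient has the kernel's own shape: a 4×4 input correlated with a 2×2 error term gives a 3×3 gradient, i.e. input width minus error width plus one.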
Step 6-3-3, calculate the gradient of each hidden layer error term with respect to the convolution kernels: let l_H denote the hidden layer currently being processed, l_H ∈ {H1, H2, H3, H4, H5}. Starting from layer H1, the gradient of each hidden layer error term with respect to its convolution kernels is calculated in turn. The hidden layer error term is first cropped, retaining its central part, denoted δ̂_lH; when l_H = H5, the central 4×4 part of the H5-layer error term is retained. The i-th input feature map of the hidden layer is then convolved with the j-th component of δ̂_lH, and the convolution result is the gradient value ∇k_lH^{ij} of the corresponding convolution kernel. The formula is as follows:
∇k_lH^{ij} = a_{lH−1}^i * δ̂_lH^j,
where j ∈ [1, OutputMaps_lH] and i ∈ [1, OutputMaps_{lH−1}], OutputMaps_lH and OutputMaps_{lH−1} respectively denoting the number of output feature maps of layer l_H and of layer l_H − 1;
Step 6-3-4, calculate the gradient of each hidden layer error term with respect to the biases: let l_H denote the hidden layer currently being processed, l_H ∈ {H1, H2, H3, H4, H5}. Starting from layer H1, the gradient of each hidden layer error term with respect to its biases is calculated in turn: all elements of the j-th component of the cropped error term δ̂_lH obtained in step 6-3-3 are summed, giving the gradient value ∇b_lH^j of the j-th bias of this layer. The formula is as follows:
∇b_lH^j = Sum(δ̂_lH^j),
where Sum(·) denotes summation over all elements of a matrix;
Step 6-3-5, calculate the gradient of the F1-layer error term with respect to the weighting parameters: the products of the probability vector error terms δ_HPV, δ_VPV with the F1-layer error term δ_F1 are calculated separately; the results are the gradient values of the F1-layer error term with respect to the weighting parameters WH and WV. The formulas are as follows:
∇WH = (δ_HPV)^T × (δ_F1)^T,
∇WV = δ_VPV × (δ_F1)^T,
where ∇WH is the gradient value of the error term with respect to the horizontal weighting parameters and ∇WV is the gradient value of the error term with respect to the vertical weighting parameters;
Step 6-3-6, calculate the gradient of the F1-layer error term with respect to the offset parameters: the error terms δ_HPV and δ_VPV of the horizontal and vertical probability vectors serve directly as the gradient values of the F1-layer error term with respect to the horizontal offset parameter BH and the vertical offset parameter BV. The formulas are as follows:
∇BH = (δ_HPV)^T,
∇BV = δ_VPV,
where ∇BH is the gradient value of the error term with respect to the horizontal offset parameter and ∇BV is the gradient value of the error term with respect to the vertical offset parameter;
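With δ_HPV a 1×41 row vector, δ_VPV a 41×1 column vector, and δ_F1 of size 512×1, the products of steps 6-3-5 and 6-3-6 yield gradients matching the 41×512 weight matrices and 41×1 offset vectors, as this shape check sketches (random stand-in data):

```python
import numpy as np

rng = np.random.default_rng(1)
d_HPV = rng.standard_normal((1, 41))   # horizontal probability vector error term
d_VPV = rng.standard_normal((41, 1))   # vertical probability vector error term
d_F1 = rng.standard_normal((512, 1))   # F1-layer error term

grad_WH = d_HPV.T @ d_F1.T             # (41x1)(1x512) -> 41x512, matches WH
grad_WV = d_VPV @ d_F1.T               # (41x1)(1x512) -> 41x512, matches WV
grad_BH = d_HPV.T                      # 41x1, matches BH
grad_BV = d_VPV                        # 41x1, matches BV
```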
Step 6-4 includes the following steps:
Step 6-4-1, update the weighting parameters of each convolutional layer: the gradient of each convolutional layer error term with respect to the convolution kernels obtained in step 6-3-1 is multiplied by the learning rate of the RDCNN to give the correction term of the convolution kernel; the original convolution kernel minus this correction term gives the updated convolution kernel. The formula is as follows:
k_lC^{ij} = k_lC^{ij} − λ·∇k_lC^{ij};
Step 6-4-2, update the offset parameters of each convolutional layer: the gradient of each convolutional layer error term with respect to the biases obtained in step 6-3-2 is multiplied by the learning rate of the RDCNN to give the correction term of the offset parameter; the original bias term minus this correction term gives the updated bias term. The formula is as follows:
b_lC^j = b_lC^j − λ·∇b_lC^j;
Step 6-4-3, update the weighting parameters of each hidden layer: the gradient of each hidden layer error term with respect to the convolution kernels obtained in step 6-3-3 is multiplied by the learning rate of the RDCNN to give the correction term of the convolution kernel; the original convolution kernel minus this correction term gives the updated convolution kernel. The formula is as follows:
k_lH^{ij} = k_lH^{ij} − λ·∇k_lH^{ij};
Step 6-4-4, update the offset parameters of each hidden layer: the gradient of each hidden layer error term with respect to the biases obtained in step 6-3-4 is multiplied by the learning rate of the RDCNN to give the correction term of the offset parameter; the original bias term minus this correction term gives the updated bias term. The formula is as follows:
b_lH^j = b_lH^j − λ·∇b_lH^j;
Step 6-4-5, update the F1-layer weighting parameters: the gradient values of the F1-layer error term with respect to the weighting parameters WH and WV obtained in step 6-3-5 are multiplied by the learning rate of the RDCNN to give the correction terms of the weighting parameters; the original WH and WV minus the respective correction terms give the updated WH and WV. The formulas are as follows:
WH = WH − λ·∇WH,
WV = WV − λ·∇WV;
Step 6-4-6, update the F1-layer offset parameters: the gradient values of the F1-layer error term with respect to the offset parameters BH and BV obtained in step 6-3-6 are multiplied by the learning rate of the RDCNN to give the correction terms of the offset parameters; the original BH and BV minus the respective correction terms give the updated BH and BV. The formulas are as follows:
BH = BH − λ·∇BH,
BV = BV − λ·∇BV.
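Every update of step 6-4 applies the same gradient-descent rule, parameter minus learning rate times gradient; one kernel update suffices as a sketch (toy values):

```python
import numpy as np

lam = 0.0001                       # learning rate lambda of the RDCNN (step 1-3)
k = np.full((3, 3), 0.5)           # a convolution kernel
grad_k = np.ones((3, 3))           # its gradient from step 6-3

k = k - lam * grad_k               # updated kernel, as in step 6-4-1
```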
Beneficial effects: the present invention realizes radar echo extrapolation using convolutional neural network (CNN) image processing techniques and proposes a recurrent dynamic convolutional neural network (RDCNN) structure. The network consists of a recurrent dynamic sub-network (RDSN) and a probability prediction layer (PPL), and possesses both dynamic and recurrent characteristics. The convolution kernels of the PPL are calculated by the RDSN and have a mapping relationship with the input radar echo images, so in the online testing stage of the RDCNN these kernels can still change according to differences in the input, giving the network its dynamic characteristic. The RDSN adds hidden layers to the traditional CNN model; the hidden layers and convolutional layers form a recurrent structure through which historical training information can be retained recursively, giving the network its recurrent characteristic. The RDCNN is trained with a large amount of radar echo image data until the network converges; the trained network can realize radar echo extrapolation well.
Detailed description of the invention
The present invention is further illustrated below with reference to the accompanying drawings and specific embodiments; the above-mentioned and other advantages of the invention will become apparent.
Fig. 1 is flow chart of the present invention.
Fig. 2 is a structure diagram of the recurrent dynamic convolutional neural network initialization model.
Fig. 3 is a structure diagram of the recurrent dynamic sub-network.
Fig. 4 is a structure diagram of the probability prediction layer.
Fig. 5 is a schematic diagram of zero expansion of a matrix.
Fig. 6 is a schematic diagram of the process of up-sampling a 2×2 matrix.
Specific embodiment
The present invention will be further described with reference to the accompanying drawings and embodiments.
As shown in Fig. 1, the invention discloses a radar echo extrapolation model training method based on a recurrent dynamic convolutional neural network, comprising the following steps:
Step 1, offline training of the recurrent dynamic convolutional neural network RDCNN: input the training image set, perform data preprocessing on it to obtain the training sample set, design the RDCNN structure, and initialize the network training parameters; train the RDCNN with the training sample set: an ordered image sequence is input, a forecast image is obtained by forward propagation, the error between the forecast image and the control label is calculated, and the weighting and offset parameters of the network are updated by backpropagation; this process is repeated until the prediction result reaches the training termination condition, giving a converged RDCNN model;
Step 2, RDCNN online prediction: input the test image set, perform data preprocessing on it to obtain the test sample set, and feed the test sample set into the RDCNN model obtained in step 1; the probability vectors are calculated by forward propagation through the network, and the last radar echo image of the input image sequence is convolved with the obtained probability vectors to give the predicted radar echo extrapolation image.
Step 1 includes the following steps:
Step 1-1, data preprocessing: input the training image set and perform normalization on every image in it, converting each image into a 280×280 grayscale image to obtain a grayscale image collection; the grayscale image collection is then divided to construct a training sample set containing TrainsetSize groups of samples;
Step 1-2, initialize the RDCNN: design the RDCNN structure, constructing the recurrent dynamic sub-network (RDSN) that generates the probability vectors and the probability prediction layer (PPL) that predicts the radar echo at the future time, providing the initialization model of the RDCNN for the offline training stage; Fig. 2 shows the structure of the recurrent dynamic convolutional neural network initialization model;
Step 1-3, initialize the training parameters of the RDCNN: let the network learning rate λ = 0.0001, the number of samples input per training pass BatchSize = 10, the maximum batch count of the training sample set BatchMax = ⌊TrainsetSize / BatchSize⌋, the current batch count BatchNum = 1, the maximum number of iterations of network training IterationMax = 40, and the current iteration number IterationNum = 1;
Step 1-4, read training samples: batch training is adopted; each training pass reads BatchSize groups of training samples from the training sample set obtained in step 1-1. Each group of training samples is {x1, x2, x3, x4, y}, containing 5 images in total, where {x1, x2, x3, x4} serves as the input image sequence and y is the corresponding control label;
Step 1-5, forward propagation: the RDSN extracts the features of the input image sequence and obtains the horizontal probability vector HPV and the vertical probability vector VPV; in the probability prediction layer, the last image of the input image sequence is convolved with VPV and then with HPV, giving the output forecast image of forward propagation;
Step 1-6, backpropagation: the error terms of the probability vectors are obtained in the PPL; from these, the error term of every network layer in the RDSN is calculated layer by layer from back to front; the gradients of each layer's error term with respect to the weighting and offset parameters are then calculated, and the network parameters are updated with the obtained gradients;
Step 1-7, offline training stage control: overall control of the offline network training stage is divided into the following three cases:
if unused training samples remain in the training sample set, i.e. BatchNum < BatchMax, return to step 1-4, continue reading BatchSize groups of training samples, and carry out network training;
if no unused training samples remain, i.e. BatchNum = BatchMax, and the current number of network iterations is less than the maximum number of iterations, i.e. IterationNum < IterationMax, set BatchNum = 1, return to step 1-4, continue reading BatchSize groups of training samples, and carry out network training;
if no unused training samples remain, i.e. BatchNum = BatchMax, and the number of network iterations has reached the maximum number of iterations, i.e. IterationNum = IterationMax, end the RDCNN offline training stage and obtain the converged RDCNN model.
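The three-case control flow of step 1-7 is a double loop over batches and iterations; a minimal sketch with the patent's counters (the TrainsetSize value here is an arbitrary stand-in):

```python
TrainsetSize, BatchSize = 100, 10          # BatchSize from step 1-3; TrainsetSize assumed
BatchMax = TrainsetSize // BatchSize       # maximum batch count per iteration
IterationMax = 40                          # maximum number of iterations (step 1-3)

updates = 0
IterationNum = 1
while True:
    BatchNum = 1
    while BatchNum <= BatchMax:            # read BatchSize samples, train once (step 1-4)
        updates += 1                       # stands in for one forward/backward pass
        BatchNum += 1
    if IterationNum == IterationMax:       # maximum iterations reached: training ends
        break
    IterationNum += 1                      # otherwise start the next iteration
```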
Step 1-1 data preprocessing includes the following steps:
Step 1-1-1, sampling: the images of the training image set are arranged in time order and distributed at equal intervals of 6 minutes, comprising N_Train images in total. TrainsetSize is determined from N_Train using the remainder Mod(N_Train, 4) and the floor function ⌊·⌋, so that 4×TrainsetSize+1 ≤ N_Train. After TrainsetSize is obtained, the first 4×TrainsetSize+1 images of the training image set are retained by sampling; sampling deletes the last images of the training image set so that the number of images meets the requirement;
Step 1-1-2, normalize images: image transformation and normalization operations are applied to the sampled images, converting the original 2000×2000 color images into 280×280 grayscale images;
Step 1-1-3, construct the training sample set: the training sample set is constructed from the grayscale images obtained in step 1-1-2. Every four adjacent images of the grayscale image collection, i.e. the {4N+1, 4N+2, 4N+3, 4N+4}-th images, form one input sequence, and the [4×(N+1)+1]-th image is cropped, its central 240×240 part being retained as the control label of the corresponding sample. The N-th group of samples is constructed as follows:
{x1, x2, x3, x4, y} = {G_{4N+1}, G_{4N+2}, G_{4N+3}, G_{4N+4}, Crop(G_{4(N+1)+1})},
where G_{4N+1} denotes the (4N+1)-th image of the grayscale image collection, N is a non-negative integer with N ∈ [0, TrainsetSize−1], and Crop(·) denotes the cropping operation that retains the central 240×240 part of the original image. This finally yields the training sample set containing TrainsetSize groups of training samples;
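The grouping of step 1-1-3 can be sketched as follows; the frames are stand-in arrays, and crop_center implements the Crop(·) operation that keeps the central 240×240 part:

```python
import numpy as np

def crop_center(img, size=240):
    """Crop(.): retain the central size x size part of the image."""
    r0 = (img.shape[0] - size) // 2
    c0 = (img.shape[1] - size) // 2
    return img[r0:r0 + size, c0:c0 + size]

def build_samples(gray_images):
    """Group every 4 consecutive frames as input; crop the 5th as control label."""
    n_groups = (len(gray_images) - 1) // 4         # TrainsetSize
    samples = []
    for N in range(n_groups):
        x = gray_images[4 * N: 4 * N + 4]          # {G_{4N+1}, ..., G_{4N+4}}
        y = crop_center(gray_images[4 * N + 4])    # the [4(N+1)+1]-th image, 240x240
        samples.append((x, y))
    return samples

imgs = [np.zeros((280, 280)) for _ in range(9)]    # 9 frames -> 2 sample groups
samples = build_samples(imgs)
```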
Wherein, step 1-1-2 includes the following steps:
Step 1-1-2-1, image conversion: the images sampled in step 1-1-1 are converted into grayscale images; the central 560×560 part of each original image is retained by cropping, and the cropped image is then compressed to a resolution of 280×280, yielding a 280×280 grayscale image;
Step 1-1-2-2, data normalization: every pixel value of the grayscale images obtained in step 1-1-2-1 is mapped from [0, 255] to [0, 1].
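Step 1-1-2 can be sketched as below; the 2×2 block-mean downscaling from 560×560 to 280×280 is our assumption (the patent does not fix the compression method), and the grayscale conversion is taken as already done:

```python
import numpy as np

def preprocess(img):
    """Crop the central 560x560, downscale to 280x280 (2x2 block mean, assumed),
    then map pixel values from [0, 255] to [0, 1] (step 1-1-2)."""
    r0 = (img.shape[0] - 560) // 2
    c0 = (img.shape[1] - 560) // 2
    crop = img[r0:r0 + 560, c0:c0 + 560]
    small = crop.reshape(280, 2, 280, 2).mean(axis=(1, 3))   # 560 -> 280
    return small / 255.0

gray = np.full((2000, 2000), 128.0)      # stand-in for a converted grayscale frame
out = preprocess(gray)
```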
Step 1-2 includes the following steps:
Step 1-2-1, construct the recurrent dynamic sub-network RDSN (Fig. 3 shows the structure of the recurrent dynamic sub-network):
the sub-network consists of 15 network layers, which are, from front to back, convolutional layer C1, down-sampling layer S1, hidden layer H1, convolutional layer C2, down-sampling layer S2, hidden layer H2, convolutional layer C3, down-sampling layer S3, hidden layer H3, convolutional layer C4, down-sampling layer S4, hidden layer H4, convolutional layer C5, hidden layer H5 and classifier layer F1;
Step 1-2-2, construct the probability prediction layer PPL (Fig. 4 shows the structure of the probability prediction layer):
dynamic convolutional layers DC1 and DC2 are constructed in the probability prediction layer; the vertical probability vector VPV output by the RDSN serves as the convolution kernel of dynamic convolutional layer DC1, and the horizontal probability vector HPV as the convolution kernel of dynamic convolutional layer DC2;
Wherein, step 1-2-1 includes the following steps:
Step 1-2-1-1, construct the convolutional layers: for a convolutional layer l_C, l_C ∈ {C1, C2, C3, C4, C5}, the following are determined: the number of output feature maps OutputMaps_lC, the convolution kernels k_lC and the offset parameters bias_lC. For the convolution kernels, the kernel width KernelSize_lC and the number of kernels KernelNumber_lC must be determined; the latter equals the product of the numbers of input and output feature maps of the convolutional layer, and the kernels are constructed according to the Xavier initialization method. The number of offset parameters equals the number of output feature maps of the layer. The output feature map width of layer l_C is OutputSize_lC, jointly determined by the width of the input feature maps of layer l_C and the kernel width, i.e. OutputSize_lC = OutputSize_{lC−1} − KernelSize_lC + 1, where OutputSize_{lC−1} denotes the output feature map width of the layer preceding convolutional layer l_C;
for convolutional layer C1, let the number of output feature maps OutputMaps_C1 = 12, the output feature map width OutputSize_C1 = 272, the kernel width KernelSize_C1 = 9, the offset parameters bias_C1 initialized to zero, and the number of convolution kernels KernelNumber_C1 = 48; the initial value of each kernel parameter is generated with the random-number function Rand() according to the Xavier method;
for convolutional layer C2, let OutputMaps_C2 = 32, OutputSize_C2 = 128, KernelSize_C2 = 9, the C2 offset parameters initialized to zero, and KernelNumber_C2 = 384; the initial value of each kernel parameter is generated according to the Xavier method;
for convolutional layer C3, let OutputMaps_C3 = 32, OutputSize_C3 = 56, KernelSize_C3 = 9, the C3 offset parameters initialized to zero, and KernelNumber_C3 = 1024; the initial value of each kernel parameter is generated according to the Xavier method;
for convolutional layer C4, let OutputMaps_C4 = 32, OutputSize_C4 = 20, KernelSize_C4 = 9, the C4 offset parameters initialized to zero, and KernelNumber_C4 = 1024; the initial value of each kernel parameter is generated according to the Xavier method;
for convolutional layer C5, let OutputMaps_C5 = 32, OutputSize_C5 = 4, KernelSize_C5 = 7, the C5 offset parameters initialized to zero, and KernelNumber_C5 = 1024; the initial value of each kernel parameter is generated according to the Xavier method;
Step 1-2-1-2, construct the hidden layers: for a hidden layer l_H, l_H ∈ {H1, H2, H3, H4, H5}, the following are determined: the number of output feature maps OutputMaps_lH, the convolution kernels k_lH and the offset parameters bias_lH. For the convolution kernels, the kernel width KernelSize_lH and the number of kernels KernelNumber_lH must be determined; the latter equals the product of the numbers of input and output feature maps of the hidden layer, and the kernels are constructed according to the Xavier initialization method. The number of offset parameters equals the number of output feature maps of the hidden layer. The output feature map width of layer l_H is OutputSize_lH, consistent with the input feature map width of the corresponding convolutional layer;
for hidden layer H1, let OutputMaps_H1 = 4, OutputSize_H1 = 280, KernelSize_H1 = 9, the offset parameters bias_H1 initialized to zero, and KernelNumber_H1 = 48; the initial value of each kernel parameter is generated with the random-number function Rand() according to the Xavier method;
for hidden layer H2, let OutputMaps_H2 = 8, OutputSize_H2 = 136, KernelSize_H2 = 9, the H2 offset parameters initialized to zero, and KernelNumber_H2 = 256; the initial value of each kernel parameter is generated according to the Xavier method;
for hidden layer H3, let OutputMaps_H3 = 8, OutputSize_H3 = 64, KernelSize_H3 = 9, the H3 offset parameters initialized to zero, and KernelNumber_H3 = 256; the initial value of each kernel parameter is generated according to the Xavier method;
for hidden layer H4, let OutputMaps_H4 = 8, OutputSize_H4 = 28, KernelSize_H4 = 9, the H4 offset parameters initialized to zero, and KernelNumber_H4 = 256; the initial value of each kernel parameter is generated according to the Xavier method;
for hidden layer H5, let OutputMaps_H5 = 8 and OutputSize_H5 = 10, with the H5 offset parameters initialized to zero. The H5 layer contains 256 weighting parameters k_H5; the initial value of each weighting parameter is generated according to the Xavier method;
Step 1-2-1-3, construct the down-sampling layers: the down-sampling layers contain no parameters that need training; the sampling kernels of down-sampling layers S1, S2, S3 and S4 are initialized identically. For a down-sampling layer l_S, l_S ∈ {S1, S2, S3, S4}, the number of output feature maps OutputMaps_lS is consistent with that of the convolutional layer immediately above it, and the output feature map width OutputSize_lS is 1/2 of the output feature map width of that convolutional layer, expressed as:
OutputMaps_lS = OutputMaps_{lS−1}, OutputSize_lS = OutputSize_{lS−1} / 2;
Step 1-2-1-4, construct the classifier layer: the classifier layer consists of one fully connected layer F1. The F1-layer weighting parameters are the horizontal weighting parameter matrix WH and the vertical weighting parameter matrix WV, both of size 41×512; each parameter of the weighting parameter matrices is initialized according to the Xavier method. The offset parameters are the horizontal offset parameter BH and the vertical offset parameter BV, both initialized as 41×1 one-dimensional zero vectors.
Step 1-5 includes the following steps:
Step 1-5-1, the RDSN calculates the probability vectors: the sub-network extracts the features of the input image sequence through the alternating processing of convolutional layers and down-sampling layers; the classifier layer then applies the Softmax function, yielding the horizontal probability vector HPV and the vertical probability vector VPV;
Step 1-5-2, the PPL outputs the forecast image: the HPV and VPV obtained in step 1-5-1 serve as the convolution kernels of the probability prediction layer; the last image of the input image sequence is convolved with VPV and then with HPV, giving the output forecast image of forward propagation.
Step 1-5-1 includes the following steps:
Step 1-5-1-1, judge the network layer type: let l denote the current network layer of the RDSN; l takes the values {H1, C1, S1, H2, C2, S2, H3, C3, S3, H4, C4, S4, H5, C5, F1} in turn, with initial value H1. The type of network layer l is judged: if l ∈ {H1, H2, H3, H4, H5}, l is a hidden layer and step 1-5-1-2 is executed; if l ∈ {C1, C2, C3, C4, C5}, l is a convolutional layer and step 1-5-1-3 is executed; if l ∈ {S1, S2, S3, S4}, l is a down-sampling layer and step 1-5-1-4 is executed; if l = F1, l is the classifier layer and step 1-5-1-5 is executed. The output feature maps of the convolutional layers from the previous training pass are denoted a_C′, where C ∈ {C1, C2, C3, C4, C5}; the initial value of a_C′ is the zero matrix;
Step 1-5-1-2, process a hidden layer: at this time l = l_H, l_H ∈ {H1, H2, H3, H4, H5}, and two cases are distinguished:
when l_H ∈ {H1, H2, H3, H4}, the j-th output feature map a_lH^j of layer l_H is calculated as follows: the corresponding feature maps in a_C′ (if l_H = H1, then C = C1) are expanded by zero-pixel filling to width ExpandSize_lH, then convolved with the corresponding convolution kernels of this layer; the convolution results are summed, the j-th offset parameter b_lH^j of layer l_H is added, and the result is processed by the ReLU activation function to obtain a_lH^j. The calculation formula is as follows:
a_lH^j = ReLU(Σ_{i=1..nh} Expand_Zero(a_C′^i) * k_lH^{ij} + b_lH^j),
where Expand_Zero(·) denotes the zero extension function (Fig. 5 shows a schematic diagram of zero expansion of a matrix), k_lH^{ij} is the convolution kernel relating the i-th input feature map and the j-th output feature map of layer l_H, nh is the number of input feature maps of the current hidden layer, and a_C′^i denotes the i-th input feature map of layer l_H. The value of ExpandSize_lH is determined by the input feature map width and the kernel size, with ExpandSize_lH = OutputSize_lH + KernelSize_lH − 1;
when l_H = H5, the j-th output feature map of layer H5 is calculated as follows: the feature maps of a_C5′ are expanded by zero-pixel filling to resolution 10×10 and multiplied by the corresponding weighting parameters of this layer; the results are summed, the j-th offset parameter b_H5^j of layer H5 is added, and the sum is processed by the ReLU activation function to obtain a_H5^j. The calculation formula is as follows:
a_H5^j = ReLU(Σ_i Expand_Zero(a_C5′^i) · k_H5^{ij} + b_H5^j),
where k_H5^{ij} is the weighting parameter relating the i-th input feature map and the j-th output feature map of layer H5;
all output feature maps of layer l_H are calculated in turn to obtain a_lH; l is updated to l+1, and the procedure returns to step 1-5-1-1 to judge the network type and carry out the operation of the next network layer;
Step 1-5-1-3, process a convolutional layer: at this time l = l_C, l_C ∈ {C1, C2, C3, C4, C5}. The j-th output feature map a_lC^j of layer l_C is calculated as follows: the input feature maps of layer l_C are each convolved with the corresponding convolution kernels of this layer; the convolution results are summed, the j-th offset parameter b_lC^j of layer l_C is added, and the result is processed by the ReLU activation function to obtain a_lC^j. The calculation formula is as follows:
a_lC^j = ReLU(Σ_{i=1..nc} a_{lC−1}^i * k_lC^{ij} + b_lC^j),
where k_lC^{ij} is the convolution kernel relating the i-th input feature map and the j-th output feature map of layer l_C, nc is the number of input feature maps of the convolutional layer, a_{lC−1}^i denotes the i-th input feature map of layer l_C (which is also the i-th output feature map of layer l_C − 1), and * denotes matrix convolution; if l_C = C1, layer l_C − 1 is the input layer.
All output feature maps of layer l_C are calculated in turn to obtain a_lC, and the value of a_lC is used to update a_C′ (for example, when l_C = C1, a_C1 updates a_C1′); l is updated to l+1, and the procedure returns to step 1-5-1-1 to judge the network type and carry out the operation of the next network layer;
Step 1-5-1-4, process a down-sampling layer: at this time l = l_S, l_S ∈ {S1, S2, S3, S4}. The output feature maps of the convolutional layer obtained in step 1-5-1-3 are each convolved with the sampling kernel and then sampled with stride 2; sampling gives the output feature maps of layer l_S. The calculation formula is as follows:
a_lS^j = Sample(a_{lS−1}^j * k_S),
where Sample(·) denotes the sampling processing with stride 2, k_S denotes the sampling kernel, l_S − 1 denotes the convolutional layer preceding the current down-sampling layer, and a_lS^j denotes the j-th output feature map of layer l_S. After the output feature maps a_lS of layer l_S are obtained, l is updated to l+1, and the procedure returns to step 1-5-1-1 to judge the network type and carry out the operation of the next network layer;
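Assuming the sampling kernel is a 2×2 all-1/4 kernel, which is consistent with the halved output width of step 1-2-1-3, the convolution-plus-stride-2 sampling of this step is 2×2 average pooling:

```python
import numpy as np

def downsample(a):
    """Stride-2 sampling with an assumed 2x2 all-1/4 kernel, i.e. average pooling."""
    h, w = a.shape[0] // 2, a.shape[1] // 2
    return a[:2 * h, :2 * w].reshape(h, 2, w, 2).mean(axis=(1, 3))

a_C1 = np.arange(16.0).reshape(4, 4)     # toy convolutional-layer output map
a_S1 = downsample(a_C1)                  # 2x2 down-sampled output map
```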
Step 1-5-1-4, compute the F1-layer probability vectors: here l = F1. By matrix transformation, the 32 output feature maps of C5 with resolution 4 × 4 are unfolded in column order into the output feature vector a^{F1} of layer F1 with resolution 512 × 1. The product of the horizontal weighting parameter matrix WH with a^{F1} and the product of the vertical weighting parameter matrix WV with a^{F1} are computed, the results are summed with the horizontal offset parameter BH and the vertical offset parameter BV respectively, and after processing by the Softmax function the horizontal probability vector HPV and the vertical probability vector VPV are obtained. The specific calculation formulas are as follows:

HPV = Softmax( WH × a^{F1} + BH ),
VPV = Softmax( WV × a^{F1} + BV ),

The vertical probability vector VPV is then transposed to obtain the final vertical probability vector;
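A minimal sketch of the F1-layer computation, assuming WH and WV are 41 × 512 matrices and BH, BV are 41 × 1 offsets, so that HPV can be used as a 1 × 41 row kernel and VPV as a 41 × 1 column kernel; these shapes are inferred from the 240 = 280 − 41 + 1 resolutions elsewhere in the description, not stated explicitly there.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a vector."""
    e = np.exp(z - z.max())
    return e / e.sum()

def classifier_probs(a_f1, WH, BH, WV, BV):
    """a_f1: (512, 1) unfolded C5 features.  Returns HPV as a 1 x 41 row
    vector and VPV as a 41 x 1 column vector, each summing to 1."""
    hpv = softmax(WH @ a_f1 + BH).reshape(1, -1)
    vpv = softmax(WV @ a_f1 + BV).reshape(-1, 1)
    return hpv, vpv
```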
Step 1-5-2 includes the following steps:
Step 1-5-2-1, vertical-direction prediction at layer DC1: the last input image of the input layer is convolved with the vertical probability vector VPV, giving the DC1-layer output feature map a^{DC1} with resolution 240 × 280;
Step 1-5-2-2, horizontal-direction prediction at layer DC2: the DC1-layer output feature map a^{DC1} is convolved with the horizontal probability vector HPV, giving the output prediction image of forward propagation, with resolution 240 × 240.
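The two dynamic convolutions can be sketched directly: a 41 × 1 column kernel applied to the 280 × 280 frame gives the 240 × 280 DC1 map, and a 1 × 41 row kernel then gives the 240 × 240 prediction. The shapes follow the description; the loop-based "valid" convolution is a naive illustration, not an efficient implementation.

```python
import numpy as np

def dynamic_predict(x, vpv, hpv):
    """x: (280, 280) last input frame; vpv: (41, 1) column kernel;
    hpv: (1, 41) row kernel.  Two separable 'valid' convolutions give the
    (240, 240) extrapolated frame (280 - 41 + 1 = 240 per direction)."""
    # DC1: convolve down the rows with the vertical probability vector
    v = vpv[::-1, :]                       # true convolution flips the kernel
    dc1 = np.zeros((240, 280))
    for r in range(240):
        dc1[r] = (x[r:r + 41, :] * v).sum(axis=0)
    # DC2: convolve across the columns with the horizontal probability vector
    h = hpv[:, ::-1]
    out = np.zeros((240, 240))
    for c in range(240):
        out[:, c] = (dc1[:, c:c + 41] * h).sum(axis=1)
    return out
```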
Step 1-6 includes the following steps:
Step 1-6-1, compute the PPL error terms: the difference between the prediction image obtained in step 1-5-2-2 and the control label of the input training sample is taken, the error terms of layers DC2 and DC1 are computed, and finally the error term δHPV of the horizontal probability vector and the error term δVPV of the vertical probability vector are obtained;
Step 1-6-2, compute the RDSN error terms: from the error term δHPV of the horizontal probability vector and the error term δVPV of the vertical probability vector, the error terms of the classifier layer F1, the convolutional layers (C5, C4, C3, C2, C1), the hidden layers (H5, H4, H3, H2, H1) and the down-sampling layers (S4, S3, S2, S1) are computed in turn from back to front; the resolution of any layer's error-term matrix is identical to the resolution of that layer's output feature maps;
Step 1-6-3, compute gradients: from the error terms obtained in step 1-6-2, the gradient of each network layer's error term with respect to that layer's weighting parameters and offset parameters is computed;
Step 1-6-4, update parameters: the gradient values of each network layer's weighting parameters and offset parameters obtained in step 1-6-3 are multiplied by the learning rate of the RDCNN, giving the update terms of each layer's weighting and offset parameters; the differences between the original weighting and offset parameters and their respective update terms give the updated weighting and offset parameters.
Step 1-6-1 includes the following steps:
Step 1-6-1-1, compute the error term of dynamic convolutional layer DC2: the difference between the prediction image obtained in step 1-5-2-2 and the control label of this group of samples gives the error-term matrix δDC2 of size 240 × 240;
Step 1-6-1-2, compute the error term of dynamic convolutional layer DC1: the DC2-layer error-term matrix δDC2 is expanded to 240 × 320 by zero padding, the horizontal probability vector is rotated by 180 degrees, and the expanded error-term matrix is convolved with the rotated horizontal probability vector, giving the DC1-layer error-term matrix δDC1 of size 240 × 280. The formula is as follows:
δDC1 = Expand_Zero(δDC2) * rot180(HPV),
In the above formula, rot180(·) denotes rotation by an angle of 180°. As an illustration of zero expansion, a 2 × 2 matrix is extended to a 4 × 4 matrix: in the matrix after zero expansion, the central region of resolution 2 × 2 agrees with the original matrix, and the remaining positions are filled with zero pixels;
Step 1-6-1-3, compute the probability-vector error terms: to compute the error term of the horizontal probability vector HPV, the DC1-layer output feature map is convolved with the error-term matrix δDC2; the 1 × 41 row vector obtained is the error term δHPV of HPV. The formula is as follows:
δHPV = a^{DC1} * δDC2,
To compute the error term of the vertical probability vector VPV, the input feature map of the input layer is convolved with the error-term matrix δDC1; the 41 × 1 column vector obtained is the error term δVPV of VPV. The formula is as follows:
δVPV = x4 * δDC1,
In the above formula, x4 is the last image in the input image sequence of the training sample;
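The δDC1 computation of step 1-6-1-2 can be sketched as a "full" convolution along the column direction: pad k − 1 zero columns on each side and slide the kernel; convolving with the 180°-rotated row vector is the same as correlating with the vector itself. The tiny shapes in the check are illustrative.

```python
import numpy as np

def backprop_dc1(delta_dc2, hpv):
    """delta_dc2: DC2-layer error matrix; hpv: (1, k) row kernel.
    Expand_Zero pads (k - 1) zero columns on each side, then the padded
    matrix is convolved with rot180(hpv), i.e. correlated with hpv."""
    k = hpv.shape[1]
    padded = np.pad(delta_dc2, ((0, 0), (k - 1, k - 1)))
    rows, wp = padded.shape
    out = np.zeros((rows, wp - k + 1))
    for c in range(out.shape[1]):
        out[:, c] = (padded[:, c:c + k] * hpv).sum(axis=1)
    return out
```

With a 240 × 240 δDC2 and k = 41 this pads to 240 × 320 and yields the 240 × 280 δDC1 of the description.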
Step 1-6-2 includes the following steps:
Step 1-6-2-1, compute the error term of classifier layer F1: the probability-vector error terms δVPV and δHPV obtained in step 1-6-1-3 are matrix-multiplied with the F1-layer vertical weighting parameter matrix WV and horizontal weighting parameter matrix WH respectively, and the two matrix products are summed and averaged, giving the F1-layer error term δF1. The formula is as follows:

δF1 = ( (WH)^T × (δHPV)^T + (WV)^T × δVPV ) / 2,

In the above formula, × denotes the matrix product and (·)^T denotes the transpose of a matrix; the resulting δF1 has size 512 × 1;
Step 1-6-2-2, compute the error term of convolutional layer C5: by matrix transformation, the F1-layer error term δF1 obtained in step 1-6-2-1 is reshaped into 32 matrices of resolution 4 × 4, giving the C5-layer error term δC5; δ_32^{C5} denotes the 32nd transformed matrix of resolution 4 × 4;
Step 1-6-2-3, judge the network-layer type: let l denote the network layer currently processed in the RDSN; l takes the values {H5, S4, C4, H4, S3, C3, H3, S2, C2, H2, S1, C1, H1} in turn, with initial value H5. The type of network layer l is judged: if l ∈ {H5, H4, H3, H2, H1}, then l is a hidden layer, and step 1-6-2-4 is executed; if l ∈ {S4, S3, S2, S1}, then l is a down-sampling layer, and step 1-6-2-5 is executed; if l ∈ {C4, C3, C2, C1}, then l is a convolutional layer, and step 1-6-2-6 is executed;
Step 1-6-2-4, compute a hidden-layer error term: here l = lH, lH ∈ {H5, H4, H3, H2, H1}. To compute the i-th error-term matrix δ_i^{lH} of layer lH, each error-term matrix δ_j^{l+1} of layer l+1 (a convolutional layer) is first expanded by zero padding to width ExpandSize_{l+1} (ExpandSize_{l+1} = OutputSize_{l+1} + 2·(KernelSize_{l+1} − 1)); the corresponding convolution kernels are rotated by 180 degrees, the expanded matrices are convolved with the rotated kernels, and the convolution results are summed, giving the i-th error-term matrix δ_i^{lH} of layer lH. The formula is as follows:

δ_i^{lH} = Σ_{j=1..nc} Expand_Zero(δ_j^{l+1}) * rot180(k_ij^{l+1}),

In the above formula, nc denotes the number of error terms of layer l+1 (the convolutional layer); it equals the number of output feature maps of layer l+1, i.e. nc = OutputMaps_{l+1}.
All error-term matrices are computed in turn, giving the error terms δ^{lH} of layer lH; l is updated to l−1, and step 1-6-2-3 is returned to, to judge the network type and compute the error term of the previous network layer;
Step 1-6-2-5, compute a down-sampling-layer error term: here l = lS, lS ∈ {S4, S3, S2, S1}. To compute the i-th error-term matrix δ_i^{lS} of layer lS, each error-term matrix δ_j^{l+2} of layer l+2 (the corresponding convolutional layer) is expanded by zero padding to width ExpandSize_{l+2} (ExpandSize_{l+2} = OutputSize_{l+2} + 2·(KernelSize_{l+2} − 1)); the corresponding convolution kernels k^{l+2} are rotated by 180 degrees, the expanded matrices are convolved with the rotated kernels, and the convolution results are summed, giving the i-th error-term matrix δ_i^{lS} of layer lS. The formula is as follows:

δ_i^{lS} = Σ_{j=1..nc} Expand_Zero(δ_j^{l+2}) * rot180(k_ij^{l+2}),

In the above formula, nc denotes the number of error terms of layer l+2 (the convolutional layer); it equals the number of output feature maps of layer l+2, i.e. nc = OutputMaps_{l+2}.
All error-term matrices are computed in turn, giving the error terms δ^{lS} of layer lS; l is updated to l−1, and step 1-6-2-3 is returned to, to judge the network type and compute the error term of the previous network layer;
Step 1-6-2-6, compute a convolutional-layer error term: here l = lC, lC ∈ {C4, C3, C2, C1}; since the initial value of l in step 1-6-2-3 is H5, the case lC = C5 cannot occur. For the i-th error-term matrix δ_i^{lC} of layer lC, the corresponding i-th error-term matrix δ_i^{l+1} of layer l+1 (a down-sampling layer) is first up-sampled; Fig. 6 illustrates the up-sampling of a 2 × 2 matrix. During up-sampling, the error value of each element of δ_i^{l+1} is distributed evenly over its sampling region, giving an up-sampled matrix whose resolution equals that of the output feature maps of layer lC. The inner product of the derivative of the activation function at the corresponding feature map of layer lC and the up-sampled matrix just obtained then gives the i-th error-term matrix δ_i^{lC} of layer lC. The formula is as follows:

δ_i^{lC} = ReLU'( a_i^{lC} ) ∘ UpSample( δ_i^{l+1} ),

In the above formula, ∘ denotes the element-wise (Hadamard) matrix product, and ReLU'(·) denotes the derivative of the ReLU activation function, whose form is as follows:

ReLU'(x) = 1 if x > 0, and ReLU'(x) = 0 otherwise;

UpSample(·) denotes the up-sampling function: after up-sampling, each pixel of the original image corresponds to one up-sampling region, and each original pixel value is distributed evenly over the pixels of its sampling region. All error-term matrices are computed in turn, giving the error terms δ^{lC} of layer lC;
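The UpSample(·) error redistribution and the ReLU-derivative gating can be sketched as follows; `np.kron` replicates each error entry over its 2 × 2 sampling region, and the division by 4 distributes it evenly (our reading of the "mean allocation" above).

```python
import numpy as np

def upsample_avg(delta, s=2):
    """Spread each error value evenly over its s x s sampling region."""
    return np.kron(delta, np.ones((s, s))) / (s * s)

def conv_error(a_map, delta_next):
    """Error term of a convolutional map: ReLU'(a), which is 1 where the
    activation is positive and 0 elsewhere, multiplied element-wise with
    the up-sampled error of the following down-sampling layer."""
    return (a_map > 0).astype(float) * upsample_avg(delta_next)
```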
Step 1-6-2-7: at this point layer l is a convolutional layer, i.e. l = lC; two cases follow:
if l ≠ C1, l is updated to l−1, and step 1-6-2-3 is returned to, to judge the network type and compute the error term of the previous network layer;
if l = C1, the computation of the sub-network error terms in step 1-6-2 ends;
Step 1-6-3 includes the following steps:
Step 1-6-3-1, compute the gradients of the convolutional-layer error terms with respect to the convolution kernels: let lC denote the convolutional layer currently processed, lC ∈ {C1, C2, C3, C4, C5}; starting from layer C1, the gradient of each convolutional layer's error term with respect to its convolution kernels is computed in turn. The i-th input feature map a_i^{lC−1} of the convolutional layer is convolved with the j-th error-term matrix δ_j^{lC} of layer lC; the convolution result is the gradient value ∇k_ij^{lC} of the corresponding convolution kernel. The formula is as follows:

∇k_ij^{lC} = a_i^{lC−1} * δ_j^{lC}, i ∈ [1, OutputMaps_{lC−1}], j ∈ [1, OutputMaps_{lC}],

In the above formula, OutputMaps_{lC} and OutputMaps_{lC−1} respectively denote the number of output feature maps of layer lC and of layer lC−1;
Step 1-6-3-2, compute the gradients of each convolutional layer's error term with respect to the biases: let lC denote the convolutional layer currently processed, lC ∈ {C1, C2, C3, C4, C5}; starting from layer C1, the gradient of each convolutional layer's error term with respect to its biases is computed in turn. All elements of the j-th error-term matrix δ_j^{lC} of layer lC are summed, giving the gradient value ∇b_j^{lC} of the j-th bias of this layer. The formula is as follows:

∇b_j^{lC} = Sum( δ_j^{lC} ),

In the above formula, Sum(·) denotes summation over all elements of a matrix;
Step 1-6-3-3, compute the gradients of the hidden-layer error terms with respect to the convolution kernels: let lH denote the hidden layer currently processed, lH ∈ {H1, H2, H3, H4, H5}; starting from layer H1, the gradient of each hidden layer's error term with respect to its convolution kernels is computed in turn. The hidden-layer error term is first cropped, retaining its central part of the prescribed width (when lH = H5, the central 4 × 4 part of the H5-layer error term is retained), denoted δ̂^{lH}; then the i-th input feature map a_i^{lH−1} of the hidden layer is convolved with the j-th component δ̂_j^{lH}; the convolution result is the gradient value ∇k_ij^{lH} of the corresponding convolution kernel. The formula is as follows:

∇k_ij^{lH} = a_i^{lH−1} * δ̂_j^{lH}, i ∈ [1, OutputMaps_{lH−1}], j ∈ [1, OutputMaps_{lH}],

In the above formula, OutputMaps_{lH} and OutputMaps_{lH−1} respectively denote the number of output feature maps of layer lH and of layer lH−1;
Step 1-6-3-4, compute the gradients of each hidden layer's error term with respect to the biases: let lH denote the hidden layer currently processed, lH ∈ {H1, H2, H3, H4, H5}; starting from layer H1, the gradient of each hidden layer's error term with respect to its biases is computed in turn. All elements of the j-th component of δ̂^{lH} obtained in step 1-6-3-3 are summed, giving the gradient value ∇b_j^{lH} of the j-th bias of this layer. The formula is as follows:

∇b_j^{lH} = Sum( δ̂_j^{lH} ),

In the above formula, Sum(·) denotes summation over all elements of a matrix;
Step 1-6-3-5, compute the gradients of the F1-layer error term with respect to the weighting parameters: the products of the error terms δHPV, δVPV of the horizontal and vertical probability vectors with the F1-layer error term δF1 are computed separately; the results are the gradients of the F1-layer error term with respect to the weighting parameters WH, WV. The formulas are as follows:

∇WH = (δHPV)^T × (δF1)^T,
∇WV = δVPV × (δF1)^T,

In the above formulas, ∇WH is the gradient of the error term with respect to the horizontal weighting parameters, and ∇WV is the gradient of the error term with respect to the vertical weighting parameters;
Step 1-6-3-6, compute the gradients of the F1-layer error term with respect to the offset parameters: the error terms δHPV, δVPV of the horizontal and vertical probability vectors serve directly as the gradients of the F1-layer error term with respect to the horizontal offset parameter BH and the vertical offset parameter BV. The formulas are as follows:

∇BH = (δHPV)^T,
∇BV = δVPV,

In the above formulas, ∇BH is the gradient of the error term with respect to the horizontal offset parameters, and ∇BV is the gradient of the error term with respect to the vertical offset parameters;
Step 1-6-4 includes the following steps:
Step 1-6-4-1, update each convolutional layer's weighting parameters: the gradients of each convolutional layer's error term with respect to the convolution kernels obtained in step 1-6-3-1 are multiplied by the learning rate of the RDCNN, giving the correction terms of the kernels; the difference between each original kernel and its correction term gives the updated kernel k_ij^{lC}. The formula is as follows:

k_ij^{lC} = k_ij^{lC} − λ·∇k_ij^{lC},

In the above formula, λ is the network learning rate determined in step 1-3, λ = 0.0001;
Step 1-6-4-2, update each convolutional layer's offset parameters: the gradients of each convolutional layer's error term with respect to the biases obtained in step 1-6-3-2 are multiplied by the learning rate of the RDCNN, giving the correction terms of the offset parameters; the difference between each original bias term and its correction term gives the updated bias term b_j^{lC}. The formula is as follows:

b_j^{lC} = b_j^{lC} − λ·∇b_j^{lC};

Step 1-6-4-3, update each hidden layer's weighting parameters: the gradients of each hidden layer's error term with respect to the convolution kernels obtained in step 1-6-3-3 are multiplied by the learning rate of the RDCNN, giving the correction terms of the kernels; the difference between each original kernel and its correction term gives the updated kernel k_ij^{lH}. The formula is as follows:

k_ij^{lH} = k_ij^{lH} − λ·∇k_ij^{lH},

In the above formula, λ is the network learning rate determined in step 1-3, λ = 0.0001;
Step 1-6-4-4, update each hidden layer's offset parameters: the gradients of each hidden layer's error term with respect to the biases obtained in step 1-6-3-4 are multiplied by the learning rate of the RDCNN, giving the correction terms of the offset parameters; the difference between each original bias term and its correction term gives the updated bias term b_j^{lH}. The formula is as follows:

b_j^{lH} = b_j^{lH} − λ·∇b_j^{lH};
Step 1-6-4-5, update the F1-layer weighting parameters: the gradients of the F1-layer error term with respect to the weighting parameters WH and WV obtained in step 1-6-3-5 are multiplied by the learning rate of the RDCNN, giving the correction terms of the weighting parameters; the differences between the original weighting parameters WH, WV and the respective correction terms give the updated WH and WV. The formulas are as follows:

WH = WH − λ·∇WH,
WV = WV − λ·∇WV;

Step 1-6-4-6, update the F1-layer offset parameters: the gradients of the F1-layer error term with respect to the offset parameters BH and BV obtained in step 1-6-3-6 are multiplied by the learning rate of the RDCNN, giving the correction terms of the offset parameters; the differences between the original offset parameters BH, BV and the respective correction terms give the updated BH and BV. The formulas are as follows:

BH = BH − λ·∇BH,
BV = BV − λ·∇BV.
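All of the updates in step 1-6-4 share the same form, plain gradient descent with the fixed learning rate λ = 0.0001; a one-line sketch:

```python
import numpy as np

LAMBDA = 0.0001  # network learning rate fixed in the description

def sgd_update(param, grad, lr=LAMBDA):
    """theta <- theta - lr * grad; applied alike to convolution kernels,
    biases, WH/WV and BH/BV."""
    return param - lr * grad
```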
Step 2 includes the following steps:
Step 2-1, data preprocessing: the test image set is input, every image in the test image set is standardized and converted into a 280 × 280 gray-level image, and the gray-level image set is then divided to construct a test sample set containing TestsetSize groups of samples;
Step 2-2, read the test samples: the TestsetSize groups of test samples obtained in step 2-1 are input into the trained RDCNN;
Step 2-3, forward propagation: the features of the input image sequence are extracted in the sub-network, giving the horizontal probability vector HPV_test and the vertical probability vector VPV_test; in the probability prediction layer, the last image of the input image sequence is convolved successively with VPV_test and HPV_test, giving the final extrapolated image of the RDCNN.
Step 2-1 includes the following steps:
Step 2-1-1, sampling: the images in the test image set are arranged in temporal order and distributed at equal intervals, with a time interval of 6 minutes; the set contains N_Test images in all, and TestsetSize is determined by the following formula:
if Mod(N_Test, 4) = 0,
if Mod(N_Test, 4) ≠ 0,
After TestsetSize is obtained, the first 4 × TestsetSize + 1 images of the test image set are retained by sampling; during sampling, the last images of the test image set are deleted so that the number of images meets the requirement;
Step 2-1-2, image normalization: the images obtained by sampling undergo image transformation and normalization operations; the original color images of resolution 2000 × 2000 are converted into gray-level images of resolution 280 × 280;
Step 2-1-3, construct the test sample set: the test sample set is constructed from the gray-level image set obtained in step 2-1-2. Every four adjacent images of the gray-level image set, i.e. images {4M+1, 4M+2, 4M+3, 4M+4}, form one group of input sequences, and image [4 × (M+1) + 1] is cropped, the central part of resolution 240 × 240 being retained as the control label of the corresponding sample, where M is an integer with M ∈ [0, TestsetSize−1]; this gives a test sample set containing TestsetSize groups of test samples;
Step 2-1-2 includes the following steps:
Step 2-1-2-1, image conversion: the color echo-intensity CAPPI images are converted into gray-level images; the part of resolution 560 × 560 at the center of the original image is then retained by cropping, and the cropped image is compressed to resolution 280 × 280, giving a grayscale image of resolution 280 × 280;
Step 2-1-2-2, data normalization: the value of every pixel of the grayscale images obtained in step 2-1-2-1 is mapped from [0~255] to [0~1];
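The pixel-value mapping of step 2-1-2-2 is a plain division by 255; a sketch:

```python
import numpy as np

def normalize(img_u8):
    """Map 8-bit gray values from [0, 255] into [0, 1]."""
    return img_u8.astype(np.float64) / 255.0
```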
Step 2-3 includes the following steps:
Step 2-3-1, compute the sub-network probability vectors: the features of the input image sequence are extracted in the sub-network through the alternating processing of the convolutional and down-sampling layers, and then processed by the Softmax function in the classifier layer, giving the horizontal probability vector HPV_test and the vertical probability vector VPV_test;
Step 2-3-2, compute the output image of the probability prediction layer: the VPV_test and HPV_test obtained in step 2-3-1 serve as the convolution kernels of the probability prediction layer; the last image of the input image sequence is convolved successively with VPV_test and HPV_test, giving the final extrapolated image of the RDCNN;
Step 2-3-1 includes the following steps:
Step 2-3-1-1, judge the network-layer type: let p denote the network layer currently processed in the RDSN; p takes the values {H1, C1, S1, H2, C2, S2, H3, C3, S3, H4, C4, S4, H5, C5, F1} in turn, with initial value H1. The type of network layer p is judged: if p ∈ {H1, H2, H3, H4, H5}, then p is a hidden layer, and step 2-3-1-2 is executed; if p ∈ {C1, C2, C3, C4, C5}, then p is a convolutional layer, and step 2-3-1-3 is executed; if p ∈ {S1, S2, S3, S4}, then p is a down-sampling layer, and step 2-3-1-4 is executed; if p = F1, then p is the classifier layer, and step 2-3-1-5 is executed. During the test process, the output feature maps of this test are denoted aC'', where C ∈ {C1, C2, C3, C4, C5}; the initial value of aC'' is the zero matrix;
Step 2-3-1-2, process a hidden layer: here p = pH, pH ∈ {H1, H2, H3, H4, H5}; two cases are distinguished:
When pH ∈ {H1, H2, H3, H4}, the v-th output feature map a_v^{pH} of layer pH is computed first. The corresponding feature map in aC'' (if pH = H1, then C = C1) is expanded by zero-pixel filling to width ExpandSize_{pH}; it is then convolved with the corresponding kernels of this layer, the convolution results are summed, the v-th offset parameter b_v^{pH} of layer pH is added to the summed result, and the result is processed by the ReLU activation function, giving a_v^{pH}. The calculation formula is as follows:

a_v^{pH} = ReLU( Σ_{u=1..mh} Expand_Zero( a_u ) * k_uv^{pH} + b_v^{pH} ),

In the above formula, Expand_Zero(·) denotes the zero-extension function, k_uv^{pH} is the convolution kernel corresponding to the u-th input feature map and the v-th output feature map of layer pH, mh is the number of input feature maps of the current hidden layer, and a_u denotes the u-th input feature map of layer pH; the value of ExpandSize_{pH} is determined by the width of the input feature maps and the size of the convolution kernel.
When pH = H5, the v-th output feature map a_v^{H5} of layer H5 is computed first: the feature maps of aC5'' are expanded to resolution 10 × 10 by zero-pixel filling and multiplied by the corresponding weighting parameters of this layer; the results are summed, the v-th offset parameter b_v^{H5} of layer H5 is added to the summed result, and the result is processed by the ReLU activation function, giving a_v^{H5}. The calculation formula is as follows:

a_v^{H5} = ReLU( Σ_u w_uv^{H5} · Expand_Zero( a_u^{C5''} ) + b_v^{H5} ),

In the above formula, w_uv^{H5} is the weighting parameter corresponding to the u-th input feature map and the v-th output feature map of layer H5.
All output feature maps of layer pH are computed in turn, giving a^{pH}; p is updated to p+1, and step 2-3-1-1 is returned to, to judge the network type and process the next network layer;
Step 2-3-1-3, process a convolutional layer: here p = pC, pC ∈ {C1, C2, C3, C4, C5}. The v-th output feature map a_v^{pC} of layer pC is computed first: the input feature maps of layer pC are each convolved with the corresponding kernels of this layer, the convolution results are summed, the v-th offset parameter b_v^{pC} of layer pC is added to the summed result, and the result is processed by the ReLU activation function, giving a_v^{pC}. The calculation formula is as follows:

a_v^{pC} = ReLU( Σ_{u=1..mc} a_u^{pC−1} * k_uv^{pC} + b_v^{pC} ),

In the above formula, k_uv^{pC} is the convolution kernel corresponding to the u-th input feature map and the v-th output feature map of layer pC, mc is the number of input feature maps of the convolutional layer, a_u^{pC−1} denotes the u-th input feature map of layer pC, which is at the same time the u-th output feature map of layer pC−1, and * denotes matrix convolution; if pC = C1, then layer pC−1 is the input layer.
All output feature maps of layer pC are computed in turn, giving a^{pC}, whose value updates aC'' (pC = C; for example, when pC = C1, aC1 updates aC1''). p is updated to p+1, and step 2-3-1-1 is returned to, to judge the network type and process the next network layer;
Step 2-3-1-4, process a down-sampling layer: here p = pS, pS ∈ {S1, S2, S3, S4}. The output feature maps of the convolutional layer obtained in step 2-3-1-3 are each convolved with the kernel k^{pS}, and the results are sampled with stride 2, giving the output feature maps a^{pS} of layer pS. The calculation formula is as follows:

a_j^{pS} = Sample( a_j^{pS−1} * k^{pS}, 2 ),

In the above formula, Sample(·) denotes the stride-2 sampling operation, pS−1 denotes the convolutional layer preceding the current down-sampling layer, and a_j^{pS} denotes the j-th output feature map among the output feature maps a^{pS} of layer pS. After a^{pS} is obtained, p is updated to p+1, and step 2-3-1-1 is returned to, to judge the network type and process the next network layer;
Step 2-3-1-5, compute the F1-layer probability vectors: if network layer p is the classifier layer, i.e. p = F1, then by matrix transformation the 32 output feature maps of C5 with resolution 4 × 4 are unfolded in column order into the F1-layer output feature vector a^{F1} of resolution 512 × 1. The products of the horizontal parameter matrix WH and of the vertical parameter matrix WV with a^{F1} are then computed separately, the results are summed with the horizontal offset parameter BH and the vertical offset parameter BV respectively, and after processing by the Softmax function the summed results give the horizontal probability vector HPV_test and the vertical probability vector VPV_test. The calculation formulas are as follows:

HPV_test = Softmax( WH × a^{F1} + BH ),
VPV_test = Softmax( WV × a^{F1} + BV ),

The vertical probability vector VPV_test is then transposed to obtain the final vertical probability vector;
Step 2-3-2 includes the following steps:
Step 2-3-2-1, vertical-direction prediction at layer DC1: the last input image of the input layer is convolved with the vertical probability vector VPV_test, giving the DC1-layer output feature map a_test^{DC1} with resolution 240 × 280;
Step 2-3-2-2, horizontal-direction prediction at layer DC2: the a_test^{DC1} obtained in step 2-3-2-1 is convolved with the horizontal probability vector HPV_test, giving the final extrapolated image of the RDCNN, with resolution 240 × 240.
Claims (10)
1. A radar echo extrapolation model training method based on a recurrent dynamic convolutional neural network, characterized by comprising the following steps:
Step 1, data preprocessing: a training image set is input, every image in the training image set is standardized and converted into a 280 × 280 gray-level image, giving a gray-level image set; the gray-level image set is divided to construct a training sample set containing TrainsetSize groups of samples;
Step 2, initialize the RDCNN: the RDCNN structure is designed, configured as a recurrent dynamic sub-network RDSN that generates the probability vectors and a probability prediction layer PPL that predicts the radar echo at the future time instant, providing the initialization model of the RDCNN for the off-line training stage;
Step 3, initialize the training parameters of the RDCNN: let the network learning rate λ = 0.0001, the number of samples input at each training step BatchSize = 10, the maximum number of batch training steps of the training sample set BatchMax = ⌊TrainsetSize / BatchSize⌋, the current batch training number BatchNum = 1, the maximum number of iterations of network training IterationMax = 40, and the current iteration number IterationNum = 1;
Step 4, read training samples: in the batch training mode, BatchSize groups of training samples are read at each training step from the training sample set obtained in step 1; every group of training samples contains 5 images {x1, x2, x3, x4, y}, where {x1, x2, x3, x4} serves as the input image sequence and y is the corresponding control label;
Step 5, forward propagation: the features of the input image sequence are extracted in the RDSN, giving the horizontal probability vector HPV and the vertical probability vector VPV; in the probability prediction layer, the last image of the input image sequence is convolved successively with VPV and HPV, giving the output prediction image of forward propagation;
Step 6, backpropagation: the error terms of the probability vectors are obtained in the PPL; from the probability-vector error terms, the error terms of each network layer in the RDSN are then computed in turn from back to front, the gradients of the error terms with respect to the weighting parameters and offset parameters in each network layer are computed, and the network parameters are updated with the gradients obtained;
Step 7, off-line training stage control: overall control of the off-line neural network training stage is divided into the following three cases:
if unused training samples remain in the training sample set, i.e. BatchNum < BatchMax, then step 4 is returned to, BatchSize further groups of training samples are read, and network training continues;
if no unused training samples remain in the training sample set, i.e. BatchNum = BatchMax, and the current number of network iterations is less than the maximum number of iterations, i.e. IterationNum < IterationMax, then BatchNum = 1 is set, step 4 is returned to, BatchSize groups of training samples are read again, and network training continues;
if no unused training samples remain in the training sample set, i.e. BatchNum = BatchMax, and the number of network iterations has reached the maximum number of iterations, i.e. IterationNum = IterationMax, then the RDCNN off-line training stage ends, giving the converged RDCNN model.
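The three-case control logic of step 7 amounts to a double loop over iterations and batches; the sketch below only reproduces the bookkeeping (BatchNum, IterationNum), with the per-batch work of steps 4-6 reduced to a counter.

```python
def offline_training(batch_max, iteration_max=40):
    """Step-7 control: run BatchMax batches per iteration and stop after
    IterationMax iterations, returning the total number of batch updates."""
    iteration_num = 1
    batches_run = 0
    while True:
        for batch_num in range(1, batch_max + 1):
            batches_run += 1  # steps 4-6 (read batch, forward, backward)
        if iteration_num == iteration_max:
            break  # training ends with the converged RDCNN model
        iteration_num += 1
    return batches_run
```

For example, batch_max = 5 with iteration_max = 3 performs 15 parameter updates in total.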
2. The method according to claim 1, characterized in that step 1 comprises the following steps:
Step 1-1, sampling: the training image set is input, the images in the training image set are arranged in temporal order and distributed at equal intervals, with a time interval of 6 minutes; the set contains N_Train images in all, and TrainsetSize is determined by the following formula:
where Mod(N_Train, 4) denotes N_Train modulo 4 and ⌊·⌋ denotes the greatest integer not exceeding its argument; after TrainsetSize is obtained, the first 4 × TrainsetSize + 1 images of the training image set are retained by sampling; during sampling, the last images of the training image set are deleted so that the number of images meets the requirement;
Step 1-2, normalize the images: the images obtained by sampling undergo image transformation and normalization operations; the original color images of resolution 2000 × 2000 are converted into gray-level images of resolution 280 × 280;
Step 1-3, construct the training sample set: the training sample set is constructed from the gray-level images obtained in step 1-2. Every four adjacent images of the gray-level image set, i.e. images {4N+1, 4N+2, 4N+3, 4N+4}, form one group of input sequences, and image [4 × (N+1) + 1] is cropped, the central part of resolution 240 × 240 being retained as the control label of the corresponding sample; for the N-th group of samples, the construction is as follows:
In the above formula, G_{4N+1} denotes the (4N+1)-th image of the gray-level image set, N is an integer with N ∈ [0, TrainsetSize−1], and Crop(·) denotes the cropping operation, the part of size 240 × 240 at the center of the original image being retained after cropping; this finally gives the training sample set containing TrainsetSize groups of training samples.
3. The method according to claim 2, characterized in that step 1-2 comprises the following steps:
Step 1-2-1, image conversion: the images sampled in step 1-1 are converted into gray-level images; the part of resolution 560 × 560 at the center of the original image is retained by cropping, the cropped image is compressed to resolution 280 × 280, and a grayscale image of resolution 280 × 280 is obtained;
Step 1-2-2, data normalization: the value of every pixel of the grayscale images obtained in step 1-2-1 is mapped from [0~255] to [0~1].
4. The method according to claim 3, characterized in that step 2 comprises the following steps:
Step 2-1, construct the recurrent dynamic sub-network RDSN:
the sub-network is composed of the following network layers, from front to back: convolutional layer C1, down-sampling layer S1, hidden layer H1, convolutional layer C2, down-sampling layer S2, hidden layer H2, convolutional layer C3, down-sampling layer S3, hidden layer H3, convolutional layer C4, down-sampling layer S4, hidden layer H4, convolutional layer C5, hidden layer H5 and classifier layer F1;
Step 2-2, construct the probability prediction layer PPL:
dynamic convolutional layers DC1 and DC2 are constructed in the probability prediction layer; the vertical probability vector VPV output by the RDSN serves as the convolution kernel of dynamic convolutional layer DC1, and the horizontal probability vector HPV serves as the convolution kernel of dynamic convolutional layer DC2.
5. The method according to claim 4, characterized in that step 2-1 comprises the following steps:
Step 2-1-1, constructing the convolutional layers: for a convolutional layer lC, lC ∈ {C1, C2, C3, C4, C5}, determine the following: the number of output feature maps OutputMaps_lC, the convolution kernels k^lC, and the bias parameters bias^lC. For the convolution kernels, it is necessary to determine the kernel width KernelSize_lC and the number of kernels KernelNumber_lC; the latter equals the product of the numbers of input and output feature maps of the layer, and the kernels are constructed according to the Xavier initialization method. The number of bias parameters equals the number of output feature maps of the layer. The output feature-map width of layer lC is OutputSize_lC; its value is jointly determined by the input feature-map width of layer lC and the kernel width, i.e.
OutputSize_lC = OutputSize_(lC−1) − KernelSize_lC + 1,
where OutputSize_(lC−1) denotes the output feature-map width of the layer preceding convolutional layer lC;
For convolutional layer C1, let the number of C1 output feature maps OutputMaps_C1 = 12, the C1 output feature-map width OutputSize_C1 = 272, the C1 kernel width KernelSize_C1 = 9, the C1 bias parameters bias_C1 initialized to zero, and the number of C1 convolution kernels k^C1 KernelNumber_C1 = 48; the initial value of each kernel parameter is generated by the Xavier method, with rand() generating the random numbers;
For convolutional layer C2, let OutputMaps_C2 = 32, OutputSize_C2 = 128, KernelSize_C2 = 9, the C2 bias parameters initialized to zero, and KernelNumber_C2 = 384, the kernel parameters being initialized in the same way;
For convolutional layer C3, let OutputMaps_C3 = 32, OutputSize_C3 = 56, KernelSize_C3 = 9, the C3 bias parameters initialized to zero, and KernelNumber_C3 = 1024, the kernel parameters being initialized in the same way;
For convolutional layer C4, let OutputMaps_C4 = 32, OutputSize_C4 = 20, KernelSize_C4 = 9, the C4 bias parameters initialized to zero, and KernelNumber_C4 = 1024, the kernel parameters being initialized in the same way;
For convolutional layer C5, let OutputMaps_C5 = 32, OutputSize_C5 = 4, KernelSize_C5 = 7, the C5 bias parameters initialized to zero, and KernelNumber_C5 = 1024, the kernel parameters being initialized in the same way;
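The exact random formula is lost from the claim text (only "rand()" and the Xavier method survive), so the following NumPy sketch assumes the standard Xavier uniform bound sqrt(6 / (fan_in + fan_out)); the layer-C1 map counts follow the claims, with 4 input maps inferred from KernelNumber_C1 = 48 = 4 × 12.

```python
import numpy as np

def xavier_kernels(in_maps, out_maps, ksize, seed=0):
    """Build in_maps * out_maps square kernels of width ksize with
    Xavier-uniform initial values (assumed formula)."""
    rng = np.random.default_rng(seed)
    fan_in = in_maps * ksize * ksize
    fan_out = out_maps * ksize * ksize
    bound = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-bound, bound, size=(in_maps, out_maps, ksize, ksize))

# Layer C1 of the claims: 12 output maps, kernel width 9,
# KernelNumber_C1 = 4 * 12 = 48 kernels in total.
k_c1 = xavier_kernels(in_maps=4, out_maps=12, ksize=9)
```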
Step 2-1-2, constructing the hidden layers: for a hidden layer lH, lH ∈ {H1, H2, H3, H4, H5}, determine the following: the number of output feature maps OutputMaps_lH, the convolution kernels k^lH, and the bias parameters bias^lH. For the convolution kernels, it is necessary to determine the kernel width KernelSize_lH and the number of kernels KernelNumber_lH; the latter equals the product of the numbers of input and output feature maps of the hidden layer, and the kernels are constructed according to the Xavier initialization method. The number of bias parameters equals the number of output feature maps of the hidden layer. The output feature-map width of layer lH is OutputSize_lH, which is consistent with the input feature-map width of the corresponding convolutional layer;
For hidden layer H1, let the number of H1 output feature maps OutputMaps_H1 = 4, the H1 output feature-map width OutputSize_H1 = 280, the H1 kernel width KernelSize_H1 = 9, the H1 bias parameters bias_H1 initialized to zero, and the number of H1 convolution kernels k^H1 KernelNumber_H1 = 48; the initial value of each kernel parameter is generated by the Xavier method, with rand() generating the random numbers;
For hidden layer H2, let OutputMaps_H2 = 8, OutputSize_H2 = 136, KernelSize_H2 = 9, the H2 bias parameters initialized to zero, and KernelNumber_H2 = 256, the kernel parameters being initialized in the same way;
For hidden layer H3, let OutputMaps_H3 = 8, OutputSize_H3 = 64, KernelSize_H3 = 9, the H3 bias parameters initialized to zero, and KernelNumber_H3 = 256, the kernel parameters being initialized in the same way;
For hidden layer H4, let OutputMaps_H4 = 8, OutputSize_H4 = 28, KernelSize_H4 = 9, the H4 bias parameters initialized to zero, and KernelNumber_H4 = 256, the kernel parameters being initialized in the same way;
For hidden layer H5, let OutputMaps_H5 = 8 and OutputSize_H5 = 10; the H5 bias parameters are initialized to zero, and the H5 layer contains 256 weighting parameters k^H5, each initialized in the same way;
Step 2-1-3, constructing the down-sampling layers: the down-sampling layers contain no parameters that need training; the sampling kernels of down-sampling layers S1, S2, S3 and S4 are initialized as 2 × 2 mean kernels (each element equal to 1/4). For a down-sampling layer lS, lS ∈ {S1, S2, S3, S4}, the number of output feature maps OutputMaps_lS is consistent with the number of output feature maps of the convolutional layer immediately above it, and the output feature-map width OutputSize_lS is 1/2 of that convolutional layer's output feature-map width, i.e. OutputSize_lS = OutputSize_(lS−1) / 2;
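A sketch of the stride-2 down-sampling, assuming the 2 × 2 all-1/4 sampling kernel described above (with that kernel, convolving and then sampling with stride 2 is equivalent to 2 × 2 mean pooling):

```python
import numpy as np

def downsample(feature_map):
    """Down-sampling layers S1-S4: 2x2 mean kernel plus stride-2 sampling,
    which halves the feature-map width."""
    h, w = feature_map.shape
    return feature_map.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

# C1 outputs 272x272 maps, so S1 outputs 136x136 maps, as in the claims.
s1_map = downsample(np.arange(272.0 * 272.0).reshape(272, 272))
```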
Step 2-1-4, constructing the classifier layer: the classifier layer consists of one fully connected layer F1. The F1 weighting parameters are the horizontal weighting parameter matrix WH and the vertical weighting parameter matrix WV, each of size 41 × 512, and each element of the weighting parameter matrices is randomly initialized. The bias parameters are the horizontal bias parameter BH and the vertical bias parameter BV, each initialized as a 41 × 1 one-dimensional zero vector.
6. The method according to claim 4, characterized in that step 5 comprises the following steps:
Step 5-1, the RDSN computes the probability vectors: features of the input image sequence are extracted in the sub-network through the alternating processing of the convolutional and down-sampling layers, then processed by the Softmax function in the classifier layer, yielding the horizontal probability vector HPV and the vertical probability vector VPV;
Step 5-2, the PPL outputs the forecast image: with the HPV and VPV obtained in step 5-1 as the convolution kernels of the probability prediction layer, the last image of the input image sequence is convolved successively with VPV and then with HPV, yielding the output forecast image of forward propagation.
7. The method according to claim 5, characterized in that step 5-1 comprises the following steps:
Step 5-1-1, judging the network-layer type: let l denote the current network layer in the RDSN; l successively takes the values {H1, C1, S1, H2, C2, S2, H3, C3, S3, H4, C4, S4, H5, C5, F1}, with initial value H1. Judge the type of network layer l: if l ∈ {H1, H2, H3, H4, H5}, then l is a hidden layer; execute step 5-1-2. If l ∈ {C1, C2, C3, C4, C5}, then l is a convolutional layer; execute step 5-1-3. If l ∈ {S1, S2, S3, S4}, then l is a down-sampling layer; execute step 5-1-4. If l = F1, then l is the classifier layer; execute step 5-1-5. During training, the output feature maps of the most recently processed convolutional layer are denoted a_C', where C ∈ {C1, C2, C3, C4, C5}; the initial value of a_C' is the zero matrix;
Step 5-1-2, processing a hidden layer: here l = lH, lH ∈ {H1, H2, H3, H4, H5}, and two cases arise:
When lH ∈ {H1, H2, H3, H4}, first compute the j-th output feature map a_j^lH of layer lH. If lH = H1, then C = C1. Expand the width of the corresponding feature maps in a_C' by zero-pixel padding to OutputSize_lH + KernelSize_lH − 1, convolve them with the corresponding convolution kernels of this layer, sum the convolution results, add the j-th bias parameter b_j^lH of layer lH, and process the sum with the ReLU activation function to obtain a_j^lH; the calculation formula is:
a_j^lH = ReLU( Σ_{i=1..nh} Expand_Zero(a_i^C') * k_{ij}^lH + b_j^lH ),
where Expand_Zero() denotes the zero-extension function, k_{ij}^lH is the convolution kernel connecting the i-th input feature map and the j-th output feature map of layer lH, b_j^lH is the j-th bias of layer lH, nh is the number of input feature maps of the current hidden layer, and a_i^C' denotes the i-th input feature map of layer lH. The width OutputSize_lH is determined by the width of the expanded input feature maps and the kernel size, with OutputSize_lH = ExpandSize_lH − KernelSize_lH + 1;
When lH = H5, first compute the j-th output feature map a_j^H5 of layer H5: expand the resolution of the feature maps in a_C5' to 10 × 10 by zero-pixel padding, multiply them by the corresponding weighting parameters of this layer, sum the results, add the j-th bias parameter b_j^H5 of layer H5, and process the sum with the ReLU activation function to obtain a_j^H5; the calculation formula is:
a_j^H5 = ReLU( Σ_i Expand_Zero(a_i^C5') · k_{ij}^H5 + b_j^H5 ),
where k_{ij}^H5 is the weighting parameter connecting the i-th input feature map and the j-th output feature map of layer H5;
Compute all output feature maps of layer lH in turn to obtain the output feature maps a^lH of layer lH, update l to l + 1, and return to step 5-1-1 to judge the network type and process the next network layer;
Step 5-1-3, processing a convolutional layer: here l = lC, lC ∈ {C1, C2, C3, C4, C5}. First compute the j-th output feature map a_j^lC of layer lC: convolve each input feature map of layer lC with the corresponding convolution kernel of this layer, sum the convolution results, add the j-th bias parameter b_j^lC of layer lC, and process the sum with the ReLU activation function to obtain a_j^lC; the calculation formula is:
a_j^lC = ReLU( Σ_{i=1..nc} a_i^(lC−1) * k_{ij}^lC + b_j^lC ),
where k_{ij}^lC is the convolution kernel connecting the i-th input feature map and the j-th output feature map of layer lC, nc is the number of input feature maps of the convolutional layer, a_i^(lC−1) denotes the i-th input feature map of layer lC (which is also the i-th output feature map of layer lC − 1), and * denotes matrix convolution; if lC = C1, then layer lC − 1 is the input layer;
Compute all output feature maps of layer lC in turn to obtain the output feature maps a^lC of layer lC, update a_C' with the value of a^lC, update l to l + 1, and return to step 5-1-1 to judge the network type and process the next network layer;
Step 5-1-4, processing a down-sampling layer: here l = lS, lS ∈ {S1, S2, S3, S4}. Convolve each output feature map of the convolutional layer obtained in step 5-1-3 with the sampling kernel of layer lS, then sample with stride 2 to obtain the output feature maps a^lS of layer lS; the calculation formula is:
a_j^lS = Sample( a_j^(lS−1) * kernel_lS ),
where Sample() denotes sampling with stride 2, lS − 1 denotes the convolutional layer immediately preceding the current down-sampling layer, and a_j^lS denotes the j-th output feature map of layer lS. After obtaining the output feature maps a^lS of layer lS, update l to l + 1 and return to step 5-1-1 to judge the network type and process the next network layer;
Step 5-1-5, computing the F1 probability vectors: here l = F1. By matrix transformation, unfold the 32 output feature maps of C5, each of resolution 4 × 4, in column order into the F1 output feature vector a_F1 of resolution 512 × 1. Compute the product of the horizontal weighting parameter matrix WH with a_F1 and of the vertical weighting parameter matrix WV with a_F1, add the horizontal bias parameter BH and the vertical bias parameter BV respectively, and process the results with the Softmax function to obtain the horizontal probability vector HPV and the vertical probability vector VPV; the calculation formulas are:
HPV = Softmax( WH × a_F1 + BH ),
VPV = Softmax( WV × a_F1 + BV );
The vertical probability vector VPV is then transposed to obtain the final vertical probability vector.
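Step 5-1-5 can be sketched as follows. The claims fix only the shapes (a_F1 is 512 × 1, WH/WV are 41 × 512, BH/BV are 41 × 1 zero vectors) and the Softmax, so random placeholders are used for the values, and the horizontal vector is reshaped to a 1 × 41 row so it can act as the DC2 kernel.

```python
import numpy as np

def softmax(x):
    """Numerically stable Softmax, as applied in classifier layer F1."""
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)
a_f1 = rng.standard_normal((512, 1))          # F1 output feature vector
WH = rng.standard_normal((41, 512))           # horizontal weighting matrix
WV = rng.standard_normal((41, 512))           # vertical weighting matrix
BH, BV = np.zeros((41, 1)), np.zeros((41, 1))  # bias parameters (zeros)

HPV = softmax(WH @ a_f1 + BH).reshape(1, 41)  # horizontal: 1x41 row vector
VPV = softmax(WV @ a_f1 + BV).reshape(41, 1)  # vertical: 41x1 column vector
```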
8. The method according to claim 7, characterized in that step 5-2 comprises the following steps:
Step 5-2-1, DC1 layer vertical-direction prediction: convolve the last input image of the input layer with the vertical probability vector VPV to obtain the DC1 output feature map a_DC1 with a resolution of 240 × 280;
Step 5-2-2, DC2 layer horizontal-direction prediction: convolve the DC1 output feature map a_DC1 with the horizontal probability vector HPV to obtain the output forecast image of forward propagation, with a resolution of 240 × 240.
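A sketch of the two "valid" convolutions of step 5-2, which confirms the claimed resolutions (280 − 41 + 1 = 240). Uniform placeholder probability vectors are used, since only their shapes come from the claims.

```python
import numpy as np

def valid_conv2d(img, kernel):
    """'Valid' 2-D convolution: the probability vectors act directly as
    kernels, so no kernel flipping is performed here."""
    kh, kw = kernel.shape
    oh, ow = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * img[i:i + oh, j:j + ow]
    return out

img = np.random.default_rng(0).random((280, 280))  # last input image
VPV = np.full((41, 1), 1 / 41)  # placeholder vertical probability vector
HPV = np.full((1, 41), 1 / 41)  # placeholder horizontal probability vector
a_dc1 = valid_conv2d(img, VPV)     # DC1 output: 240x280
pred = valid_conv2d(a_dc1, HPV)    # DC2 output (forecast image): 240x240
```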
9. The method according to claim 8, characterized in that step 6 comprises the following steps:
Step 6-1, computing the PPL error terms: take the difference between the forecast image obtained in step 5-2-2 and the comparison label of the input training sample, compute the error terms of the DC2 and DC1 layers, and finally obtain the error term δHPV of the horizontal probability vector and the error term δVPV of the vertical probability vector;
Step 6-2, computing the RDSN error terms: according to δHPV and δVPV, compute from back to front the error terms of classifier layer F1, the convolutional layers (C5, C4, C3, C2, C1), the hidden layers (H5, H4, H3, H2, H1) and the down-sampling layers (S4, S3, S2, S1); the resolution of any layer's error term matrix is consistent with the resolution of that layer's output feature maps;
Step 6-3, computing the gradients: from the error terms obtained in step 6-2, compute the gradient values of each network layer's error term with respect to that layer's weighting and bias parameters;
Step 6-4, updating the parameters: multiply the gradient values of each network layer's weighting and bias parameters obtained in step 6-3 by the learning rate of the RDCNN to obtain the update terms of each layer's weighting and bias parameters, and subtract the respective update terms from the original weighting and bias parameters to obtain the updated weighting and bias parameters.
10. The method according to claim 9, characterized in that step 6-1 comprises the following steps:
Step 6-1-1, computing the error term of dynamic convolutional layer DC2: take the difference between the forecast image obtained in step 5-2-2 and the comparison label of this group of samples, obtaining the error term matrix δDC2 of size 240 × 240;
Step 6-1-2, computing the error term of dynamic convolutional layer DC1: expand the DC2 error term matrix δDC2 to 240 × 320 by zero padding, rotate the horizontal probability vector by 180 degrees, and convolve the expanded error term matrix with the rotated horizontal probability vector to obtain the DC1 error term matrix δDC1 of size 240 × 280; the formula is:
δDC1 = Expand_Zero(δDC2) * rot180(HPV),
where rot180() denotes rotation by 180°. As an illustration of Expand_Zero(), a 2 × 2 matrix is zero-extended to a 4 × 4 matrix: in the matrix after zero expansion, the central 2 × 2 region is consistent with the original matrix and the remaining positions are filled with zero pixels;
Step 6-1-3, computing the probability vector error terms: to compute the error term of the horizontal probability vector HPV, convolve the DC1 output feature map with the error term matrix δDC2; the 1 × 41 row vector obtained by the convolution is the error term δHPV of HPV:
δHPV = aDC1 * δDC2;
To compute the error term of the vertical probability vector VPV, convolve the input feature map of the input layer with the error term matrix δDC1; the 41 × 1 column vector obtained by the convolution is the error term δVPV of VPV:
δVPV = x_last * δDC1,
where x_last denotes the last image of the input image sequence of the training sample;
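Step 6-1-2 can be sketched as follows, re-using a small "valid" convolution helper; padding each side horizontally by KernelSize − 1 = 40 zeros takes the 240 × 240 error matrix to 240 × 320 as the claim states, and convolving with the rotated 1 × 41 vector yields the 240 × 280 DC1 error matrix.

```python
import numpy as np

def valid_conv2d(img, kernel):
    """'Valid' 2-D convolution helper (same as in the forward pass)."""
    kh, kw = kernel.shape
    oh, ow = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * img[i:i + oh, j:j + ow]
    return out

delta_dc2 = np.random.default_rng(0).random((240, 240))  # DC2 error term
HPV = np.full((1, 41), 1 / 41)                           # placeholder HPV
# Expand_Zero: pad horizontally by KernelSize - 1 = 40 zeros on each side.
expanded = np.pad(delta_dc2, ((0, 0), (40, 40)))
delta_dc1 = valid_conv2d(expanded, np.rot90(HPV, 2))     # rot180(HPV)
```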
Step 6-2 comprises the following steps:
Step 6-2-1, computing the error term of classifier layer F1: multiply the probability vector error terms δVPV and δHPV obtained in step 6-1-3 by the F1 vertical weighting parameter matrix WV and horizontal weighting parameter matrix WH respectively, then sum the resulting matrix products and take their average, obtaining the F1 error term δF1; here × denotes the matrix product and (·)^T the matrix transpose, and the resulting δF1 has size 512 × 1;
Step 6-2-2, computing the error term of convolutional layer C5: by matrix transformation, transform the F1 error term δF1 obtained in step 6-2-1 into 32 matrices of resolution 4 × 4, which together constitute the C5 error term δC5; the 32nd of these matrices is the 32nd transformed 4 × 4 matrix;
Step 6-2-3, judging the network-layer type: let l denote the current network layer in the RDSN; l successively takes the values {H5, S4, C4, H4, S3, C3, H3, S2, C2, H2, S1, C1, H1}, with initial value H5. Judge the type of network layer l: if l ∈ {H5, H4, H3, H2, H1}, then l is a hidden layer; execute step 6-2-4. If l ∈ {S4, S3, S2, S1}, then l is a down-sampling layer; execute step 6-2-5. If l ∈ {C4, C3, C2, C1}, then l is a convolutional layer; execute step 6-2-6;
Step 6-2-4, computing a hidden-layer error term: here l = lH, lH ∈ {H5, H4, H3, H2, H1}. To compute the i-th error term matrix δ_i^lH of layer lH, expand each error term matrix δ_(l+1) of convolutional layer l + 1 by zero padding to the width
ExpandSize_(l+1) = OutputSize_(l+1) + 2·(KernelSize_(l+1) − 1),
then rotate the corresponding convolution kernels k_{ij}^(l+1) by 180 degrees, convolve the expanded matrices with the rotated kernels, and sum the convolution results to obtain δ_i^lH; the formula is:
δ_i^lH = Σ_{j=1..nc} Expand_Zero(δ_j^(l+1)) * rot180(k_{ij}^(l+1)),
where nc denotes the number of error terms of convolutional layer l + 1; its value is identical to the number of output feature maps of layer l + 1, i.e. nc = OutputMaps_(l+1);
Compute all error term matrices in turn to obtain the error term matrices δ^lH of layer lH, update l to l − 1, and return to step 6-2-3 to judge the network type and compute the error terms of the previous network layer;
Step 6-2-5, computing a down-sampling-layer error term: here l = lS, lS ∈ {S4, S3, S2, S1}. To compute the i-th error term matrix δ_i^lS of layer lS, expand each error term matrix δ_(l+2) of convolutional layer l + 2 by zero padding to the width
ExpandSize_(l+2) = OutputSize_(l+2) + 2·(KernelSize_(l+2) − 1),
then rotate the corresponding convolution kernels k_{ij}^(l+2) by 180 degrees, convolve the expanded matrices with the rotated kernels, and sum the convolution results to obtain δ_i^lS; the formula is:
δ_i^lS = Σ_{j=1..nc} Expand_Zero(δ_j^(l+2)) * rot180(k_{ij}^(l+2)),
where nc denotes the number of error terms of convolutional layer l + 2; its value is identical to the number of output feature maps of layer l + 2, i.e. nc = OutputMaps_(l+2);
Compute all error term matrices in turn to obtain the error term matrices δ^lS of layer lS, update l to l − 1, and return to step 6-2-3 to judge the network type and compute the error terms of the previous network layer;
Step 6-2-6, computing a convolutional-layer error term: here l = lC, lC ∈ {C4, C3, C2, C1}; since the initial value of l in step 6-2-3 is H5, the case lC = C5 does not arise. For the i-th error term matrix δ_i^lC of layer lC, first up-sample the corresponding i-th error term matrix δ_i^(l+1) of down-sampling layer l + 1: when up-sampling, the error value of each element of δ_i^(l+1) is distributed evenly over its sampling region, yielding an up-sampled matrix whose resolution matches the output feature maps of layer lC. Then compute the elementwise product of the derivative of the activation function at the corresponding feature map of layer lC with the up-sampled matrix, obtaining δ_i^lC; the formula is:
δ_i^lC = ReLU'(a_i^lC) ∘ UpSample(δ_i^(l+1)),
where ∘ denotes the elementwise (Hadamard) matrix product and ReLU'() denotes the derivative of the ReLU activation function, whose form is:
ReLU'(x) = 1 if x > 0, and 0 otherwise;
UpSample() denotes the up-sampling function: after up-sampling, each pixel of the original matrix corresponds to one sampling region, and each original value is distributed evenly over the pixels of its sampling region. Compute all error term matrices in turn to obtain the error term matrices δ^lC of layer lC;
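The two ingredients of step 6-2-6, sketched in NumPy: the mean-allocating up-sampling (factor 2, matching the stride-2 down-sampling layers) and the ReLU derivative.

```python
import numpy as np

def upsample_mean(delta, factor=2):
    """UpSample: distribute each error value evenly over its
    factor x factor sampling region, preserving the total error mass."""
    return np.kron(delta, np.full((factor, factor), 1.0 / factor ** 2))

def relu_prime(z):
    """Derivative of ReLU: 1 where z > 0, else 0."""
    return (z > 0).astype(float)

delta_s4 = np.ones((10, 10))  # e.g. an S4 error term matrix (10x10)
a_c4_like = np.random.default_rng(0).standard_normal((20, 20))
delta_c4 = relu_prime(a_c4_like) * upsample_mean(delta_s4)  # 20x20
```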
Step 6-2-7: at this point layer l is a convolutional layer, i.e. l = lC, and two cases arise:
if l ≠ C1, update l to l − 1 and return to step 6-2-3 to judge the network type and compute the error terms of the previous network layer;
if l = C1, the sub-network error-term computation of step 6-2 is finished;
Step 6-3 comprises the following steps:
Step 6-3-1, computing the gradients of the convolutional-layer error terms with respect to the convolution kernels: let lC denote the convolutional layer currently processed, lC ∈ {C1, C2, C3, C4, C5}; starting from layer C1, compute in turn the gradient of each convolutional layer's error term with respect to its convolution kernels. Convolve the i-th input feature map a_i^(lC−1) of the convolutional layer with the j-th error term matrix δ_j^lC of layer lC; the convolution result is the gradient value ∇k_{ij}^lC of the corresponding convolution kernel:
∇k_{ij}^lC = a_i^(lC−1) * δ_j^lC,
where j and i respectively range over the numbers of output feature maps of layer lC and of layer lC − 1;
Step 6-3-2, computing the gradients of each convolutional-layer error term with respect to the biases: with lC as above, starting from layer C1, compute in turn the gradient of each convolutional layer's error term with respect to its biases. Sum all elements of the j-th error term matrix δ_j^lC of layer lC to obtain the gradient value ∇b_j^lC of the layer's j-th bias:
∇b_j^lC = Sum(δ_j^lC),
where Sum() denotes summation over all elements of a matrix;
Step 6-3-3, computing the gradients of the hidden-layer error terms with respect to the convolution kernels: let lH denote the hidden layer currently processed, lH ∈ {H1, H2, H3, H4, H5}; starting from layer H1, compute in turn the gradient of each hidden layer's error term with respect to its convolution kernels. First crop the hidden-layer error term, retaining the central part of width OutputSize_lH − KernelSize_lH + 1, denoted δ_cut^lH; when lH = H5, retain the central 4 × 4 part of the H5 error term. Then convolve the i-th input feature map of the hidden layer with the j-th component of δ_cut^lH; the convolution result is the gradient value ∇k_{ij}^lH of the corresponding convolution kernel:
∇k_{ij}^lH = a_i^(lH−1) * δ_cut,j^lH,
where j and i respectively range over the numbers of output feature maps of layer lH and of layer lH − 1;
Step 6-3-4, computing the gradients of each hidden-layer error term with respect to the biases: with lH as above, starting from layer H1, compute in turn the gradient of each hidden layer's error term with respect to its biases. Sum all elements of the j-th component of δ_cut^lH obtained in step 6-3-3 to obtain the gradient value ∇b_j^lH of the layer's j-th bias:
∇b_j^lH = Sum(δ_cut,j^lH),
where Sum() denotes summation over all elements of a matrix;
Step 6-3-5, computing the gradients of the F1 error term with respect to the weighting parameters: compute the products of the error terms δHPV and δVPV of the horizontal and vertical probability vectors with the F1 error term δF1; the results are the gradient values of the F1 error term with respect to the weighting parameters WH and WV:
∇WH = (δHPV)^T × (δF1)^T,
∇WV = δVPV × (δF1)^T,
where ∇WH is the gradient value of the error term with respect to the horizontal weighting parameters and ∇WV the gradient value with respect to the vertical weighting parameters;
Step 6-3-6, computing the gradients of the F1 error term with respect to the bias parameters: take the error terms of the horizontal and vertical probability vectors directly as the gradient values of the F1 error term with respect to the horizontal bias parameter BH and the vertical bias parameter BV:
∇BH = (δHPV)^T,
∇BV = δVPV,
where ∇BH is the gradient value of the error term with respect to the horizontal bias parameter and ∇BV the gradient value with respect to the vertical bias parameter;
Step 6-4 comprises the following steps:
Step 6-4-1, updating each convolutional layer's weighting parameters: multiply the gradients of each convolutional layer's error term with respect to the convolution kernels, obtained in step 6-3-1, by the learning rate λ of the RDCNN to obtain the correction terms of the convolution kernels, and subtract each correction term from the original convolution kernel to obtain the updated convolution kernel:
k_{ij}^lC = k_{ij}^lC − λ·∇k_{ij}^lC;
Step 6-4-2, updating each convolutional layer's bias parameters: multiply the gradients of each convolutional layer's error term with respect to the biases, obtained in step 6-3-2, by the learning rate of the RDCNN to obtain the correction terms of the bias parameters, and subtract each correction term from the original bias term to obtain the updated bias term:
b_j^lC = b_j^lC − λ·∇b_j^lC;
Step 6-4-3, updating each hidden layer's weighting parameters: multiply the gradients of each hidden layer's error term with respect to the convolution kernels, obtained in step 6-3-3, by the learning rate of the RDCNN to obtain the correction terms of the convolution kernels, and subtract each correction term from the original convolution kernel to obtain the updated convolution kernel:
k_{ij}^lH = k_{ij}^lH − λ·∇k_{ij}^lH;
Step 6-4-4, updating each hidden layer's bias parameters: multiply the gradients of each hidden layer's error term with respect to the biases, obtained in step 6-3-4, by the learning rate of the RDCNN to obtain the correction terms of the bias parameters, and subtract each correction term from the original bias term to obtain the updated bias term:
b_j^lH = b_j^lH − λ·∇b_j^lH;
Step 6-4-5, updating the F1 weighting parameters: multiply the gradient values of the F1 error term with respect to the weighting parameters WH and WV, obtained in step 6-3-5, by the learning rate of the RDCNN to obtain the correction terms of the weighting parameters, and subtract the respective correction terms from the original WH and WV to obtain the updated WH and WV:
WH = WH − λ·∇WH,
WV = WV − λ·∇WV;
Step 6-4-6, updating the F1 bias parameters: multiply the gradient values of the F1 error term with respect to the bias parameters BH and BV, obtained in step 6-3-6, by the learning rate of the RDCNN to obtain the correction terms of the bias parameters, and subtract the respective correction terms from the original BH and BV to obtain the updated BH and BV:
BH = BH − λ·∇BH,
BV = BV − λ·∇BV.
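All six update steps share a single rule, parameter ← parameter − λ·gradient; a minimal sketch (λ = 0.05 is a hypothetical value, as the claims only name λ as the learning rate of the RDCNN without fixing it):

```python
import numpy as np

LEARNING_RATE = 0.05  # hypothetical value for lambda

def sgd_update(param, grad, lr=LEARNING_RATE):
    """Steps 6-4-1 .. 6-4-6: subtract the learning-rate-scaled gradient
    (the 'correction term') from the original parameter."""
    return param - lr * grad

WH = np.ones((41, 512))                 # F1 horizontal weighting matrix
grad_WH = np.full((41, 512), 0.1)       # gradient from step 6-3-5
WH_new = sgd_update(WH, grad_WH)        # updated WH
```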
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810402200.8A CN108846409A (en) | 2018-04-28 | 2018-04-28 | Radar echo extrapolation model training method based on cyclic dynamic convolution neural network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108846409A true CN108846409A (en) | 2018-11-20 |
Family
ID=64212387
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108846409A (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110456355A (en) * | 2019-08-19 | 2019-11-15 | 河南大学 | A kind of Radar Echo Extrapolation method based on long short-term memory and generation confrontation network |
CN110568442A (en) * | 2019-10-15 | 2019-12-13 | 中国人民解放军国防科技大学 | Radar echo extrapolation method based on confrontation extrapolation neural network |
CN110705508A (en) * | 2019-10-15 | 2020-01-17 | 中国人民解放军战略支援部队航天工程大学 | Satellite identification method of ISAR image |
CN113421252A (en) * | 2021-07-07 | 2021-09-21 | 南京思飞捷软件科技有限公司 | Actual detection method for vehicle body welding defects based on improved convolutional neural network |
CN115393725A (en) * | 2022-10-26 | 2022-11-25 | 西南科技大学 | Bridge crack identification method based on feature enhancement and semantic segmentation |
CN116184124A (en) * | 2023-04-26 | 2023-05-30 | 华东交通大学 | Power distribution network fault type identification method |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170140285A1 (en) * | 2015-11-13 | 2017-05-18 | Microsoft Technology Licensing, Llc | Enhanced Computer Experience From Activity Prediction |
CN106886023A (en) * | 2017-02-27 | 2017-06-23 | 中国人民解放军理工大学 | A kind of Radar Echo Extrapolation method based on dynamic convolutional neural networks |
CN107632295A (en) * | 2017-09-15 | 2018-01-26 | 广东工业大学 | A kind of Radar Echo Extrapolation method based on sequential convolutional neural networks |
Non-Patent Citations (1)
Title |
---|
EN SHI et al.: "A Method of Weather Radar Echo Extrapolation Based on Convolutional Neural Networks", 24th International Conference, MMM (MultiMedia Modeling) 2018, Part I *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106886023A (en) | Radar echo extrapolation method based on dynamic convolutional neural networks | |
CN108846409A (en) | Radar echo extrapolation model training method based on cyclic dynamic convolution neural network | |
CN106203430B (en) | Salient object detection method based on foreground focus degree and background prior | |
CN105787439B (en) | Depth-image human joint localization method based on convolutional neural networks | |
CN106355151B (en) | Three-dimensional SAR image target recognition method based on deep belief network | |
CN111368769B (en) | Ship multi-target detection method based on improved anchor-box generation model | |
CN109800628A (en) | Network structure and detection method for improving SSD small-object pedestrian detection performance | |
CN108254741A (en) | Target track prediction method based on recurrent neural network | |
CN110472483A (en) | Method and device for small-sample semantic feature enhancement for SAR images | |
CN109816012A (en) | Multi-scale target detection method integrating context information | |
CN107480730A (en) | Power equipment recognition model construction method and system, and power equipment recognition method | |
CN110532894A (en) | Remote sensing target detection method based on boundary-constrained CenterNet | |
CN108776779A (en) | SAR image sequence target recognition method based on convolutional recurrent network | |
CN105913025A (en) | Deep learning face recognition method based on multi-feature fusion | |
CN107358142A (en) | Semi-supervised polarimetric SAR image classification method based on random forest combination | |
CN110163836A (en) | Deep-learning-based excavator detection method for high-altitude inspection | |
CN109766936A (en) | Image change detection method based on information transfer and attention mechanism | |
CN112465006B (en) | Target tracking method and device based on graph neural network | |
CN107563411A (en) | Online SAR target detection method based on deep learning | |
CN109919045A (en) | Small-scale pedestrian detection and recognition method based on cascaded convolutional network | |
CN108447057A (en) | SAR image change detection method based on saliency and deep convolutional network | |
CN109598711A (en) | Thermal image defect extraction method based on feature mining and neural network | |
CN104680151B (en) | High-resolution panchromatic remote sensing image change detection method accounting for snow cover | |
CN107229084A (en) | Method for automatic identification, tracking and prediction of convective system targets | |
CN112949407A (en) | Remote sensing image building vectorization method based on deep learning and point set optimization | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 2018-11-20 |