CN111369595A - Optical flow calculation method based on self-adaptive correlation convolution neural network - Google Patents

Optical flow calculation method based on self-adaptive correlation convolution neural network

Info

Publication number
CN111369595A
CN111369595A
Authority
CN
China
Prior art keywords
optical flow
neural network
adaptive
adaptive correlation
convolution neural
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910980474.XA
Other languages
Chinese (zh)
Inventor
Yuan Yuan (袁媛)
Li Haopeng (李昊鹏)
Wang Qi (王琦)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northwestern Polytechnical University
Original Assignee
Northwestern Polytechnical University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northwestern Polytechnical University
Priority to CN201910980474.XA
Publication of CN111369595A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/269Analysis of motion using gradient-based methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks


Abstract

The invention provides an optical flow computation method based on an adaptive correlation convolutional neural network. For an image pair of arbitrary size, an improved adaptive correlation convolutional neural network performs pixel matching and computes the optical flow map. A 1 × 1 convolution is added before the ordinary correlation operation of the FlowNetC model, mining the correlation between different feature dimensions and integrating metric learning into the deep neural network. This improves the accuracy and robustness of optical flow computation without increasing computation time or memory consumption.

Description

Optical flow calculation method based on self-adaptive correlation convolution neural network
Technical Field
The invention belongs to the technical field of computer vision and video feature extraction, and particularly relates to an optical flow computation method based on an adaptive correlation convolutional neural network. The invention can be applied to video motion information extraction, action recognition, and the like.
Background
Optical flow computation seeks the pixel correspondences between two images. Because it conveys all of the motion information contained in the images, optical flow is applied in many fields, such as action recognition, video frame interpolation, object tracking, and video segmentation. However, accurate optical flow estimation remains challenging due to motion blur, occlusion, illumination change, and large displacements.
Existing learning-based optical flow computation methods include those using conventional machine learning techniques, e.g. Markov random field models, statistical models, stochastic optimization, and principal component analysis, and those using deep learning. In recent years, convolutional neural networks have become the state of the art for optical flow computation due to their strong fitting and representation capability and their end-to-end learning manner. Compared with traditional methods, convolutional neural networks greatly improve the accuracy and efficiency of optical flow estimation. The core of computing optical flow with a convolutional neural network is finding the pixel matches between the two images. To this end, the paper "Alexey Dosovitskiy, Philipp Fischer, Eddy Ilg, Philip Häusser, Caner Hazırbaş, Vladimir Golkov, Patrick van der Smagt, Daniel Cremers, and Thomas Brox, FlowNet: Learning optical flow with convolutional networks, ICCV, 2015" proposes FlowNetC, in which a correlation layer compares image patches of the two images. Like convolution, normalization, and pooling layers, it is fully differentiable and can be embedded in any neural network. For a neural network that computes optical flow, the correlation layer is essential. However, the correlation layer has two main limitations: 1) it only considers the correspondence of each feature dimension with itself, ignoring the dependencies between dimensions; 2) the correspondences are weighted equally, ignoring the differences between dimensions.
Disclosure of Invention
To overcome the shortcomings of the prior art, the invention provides an optical flow computation method based on an adaptive correlation convolutional neural network. For an image pair of arbitrary size, the invention performs pixel matching with an adaptive correlation layer to obtain the final optical flow map. Compared with FlowNetC, which contains an ordinary correlation layer, the method improves the accuracy and robustness of optical flow computation without increasing computation time or memory consumption.
An optical flow computation method based on an adaptive correlation convolutional neural network, comprising the following steps:
Step 1: add 256 1 × 1 convolution kernels before the ordinary correlation operation in the original FlowNetC model to obtain the improved adaptive correlation convolutional network.
Step 2: input a data set and train the improved adaptive correlation convolutional neural network to obtain the trained adaptive correlation convolutional neural network.
Step 3: given two images of arbitrary size, input them into the trained adaptive correlation convolutional neural network to obtain the optical flow map between the two images.
In the network training described in step 2, the network loss function $L$ is:

$$L = \sum_{s=1}^{S} \frac{w_s}{W_s H_s} \sum_{x=1}^{W_s} \sum_{y=1}^{H_s} \left\| \hat{f}_s(x, y) - f_s(x, y) \right\|_2$$

where $s$ is the scale index, $S$ the total number of scales, $(x, y)$ a pixel coordinate in the image, $w_s$ the weight of scale $s$, $W_s$ and $H_s$ the width and height of the optical flow image at scale $s$, $\hat{f}_s(x, y)$ the estimated optical flow vector at pixel $(x, y)$, and $f_s(x, y)$ the true optical flow vector at pixel $(x, y)$.
the initial learning rate in the training process is 0.0001, the learning rate of each 30 training rounds is reduced by 10 times, and 100 training rounds are trained in total. Network parameters were optimized using a batch adaptive gradient descent algorithm with the batch size set to 8.
The advantage of the method is that it improves the existing FlowNetC model by adding a 1 × 1 convolution before the ordinary correlation operation. This mines the correlation between different feature dimensions and integrates metric learning into the deep neural network, addressing the problems of the FlowNetC algorithm. Using the improved adaptive correlation convolutional neural network for optical flow computation improves accuracy and robustness without increasing computation time or memory consumption.
Drawings
FIG. 1 is a schematic diagram of an adaptive correlation convolutional neural network of the present invention.
Detailed Description
The present invention is further described below with reference to the drawings and an embodiment; the invention includes, but is not limited to, this embodiment.
The invention provides an optical flow computation method based on an adaptive correlation convolutional neural network, implemented as follows:
1. Improved FlowNetC model
The FlowNetC model described in "Alexey Dosovitskiy, Philipp Fischer, Eddy Ilg, Philip Häusser, Caner Hazırbaş, Vladimir Golkov, Patrick van der Smagt, Daniel Cremers, and Thomas Brox, FlowNet: Learning optical flow with convolutional networks, ICCV, 2015" contains a correlation layer that compares the similarity of each position in two feature maps, enabling accurate end-to-end estimation of optical flow maps. However, it ignores the coupling between the dimensions of the feature maps, so the resulting optical flow lacks robustness.
To overcome this problem, a learnable linear mapping can be used to mine the correlation between different feature dimensions, generalizing the ordinary Euclidean distance to a Mahalanobis distance and thereby integrating metric learning into the deep neural network. Concretely, a 1 × 1 convolution with 256 kernels is added before the ordinary correlation operation of the model.
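The link between the 1 × 1 convolution and metric learning can be illustrated with a small NumPy sketch (illustrative only — the channel counts and names are made up, and this is not the patent's PyTorch code): projecting both feature vectors with a shared linear map M turns the plain dot-product correlation into the bilinear form f1ᵀMᵀMf2, i.e. a learned Mahalanobis-style metric.

```python
import numpy as np

rng = np.random.default_rng(0)
C_in, C_out = 8, 16                      # illustrative channel counts
M = rng.standard_normal((C_out, C_in))   # weights of the shared 1x1 convolution

f1 = rng.standard_normal(C_in)           # feature vector of a pixel in image 1
f2 = rng.standard_normal(C_in)           # feature vector of a pixel in image 2

# Ordinary correlation: a dot product that treats all channels independently
# and weights them equally (the two limitations noted in the background).
plain = f1 @ f2

# Adaptive correlation: project both features with the shared 1x1 convolution
# before taking the dot product.
adaptive = (M @ f1) @ (M @ f2)

# The same value written as a bilinear form f1^T (M^T M) f2: the learned
# matrix M^T M plays the role a covariance matrix plays in a Mahalanobis
# distance, which is how metric learning enters the network.
bilinear = f1 @ (M.T @ M) @ f2
assert np.isclose(adaptive, bilinear)
```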
2. Network training
The FlowNetC model improved in step 1 is trained on public data sets. This embodiment uses the Sintel data set proposed by Butler et al. in "Daniel J. Butler, Jonas Wulff, Garrett B. Stanley, and Michael J. Black, A naturalistic open source movie for optical flow evaluation, ECCV, 2012" and the FlyingChairs data set proposed by Dosovitskiy et al. in "Alexey Dosovitskiy, Philipp Fischer, Eddy Ilg, Philip Häusser, Caner Hazırbaş, Vladimir Golkov, Patrick van der Smagt, Daniel Cremers, and Thomas Brox, FlowNet: Learning optical flow with convolutional networks, ICCV, 2015", each divided into training and test sets following the respective papers. To verify the robustness of the method of the invention, brightness changes and random noise are additionally applied to the images.
The training process employs the following multi-scale endpoint error loss function L:
$$L = \sum_{s=1}^{S} \frac{w_s}{W_s H_s} \sum_{x=1}^{W_s} \sum_{y=1}^{H_s} \left\| \hat{f}_s(x, y) - f_s(x, y) \right\|_2$$

where $s$ is the scale index, $S$ the total number of scales, $(x, y)$ a pixel coordinate in the image, $w_s$ the weight of scale $s$, $W_s$ and $H_s$ the width and height of the optical flow image at scale $s$, $\hat{f}_s(x, y)$ the estimated optical flow vector at pixel $(x, y)$, and $f_s(x, y)$ the true optical flow vector at pixel $(x, y)$. The initial learning rate is 0.0001 and is divided by 10 every 30 training epochs, for 100 training epochs in total. Network parameters are optimized using a mini-batch adaptive gradient descent algorithm with the batch size set to 8.
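The loss above can be sketched in NumPy as follows (an illustrative sketch under the assumption that each flow is stored as a (2, H_s, W_s) array; this is not the patent's PyTorch training code):

```python
import numpy as np

def multiscale_epe(pred_flows, gt_flows, weights):
    """Multi-scale endpoint-error loss L: for each scale s, the mean over
    all W_s x H_s pixels of the Euclidean norm of the difference between
    the estimated and true flow vectors, weighted by w_s and summed over
    scales.  Each flow is an array of shape (2, H_s, W_s)."""
    loss = 0.0
    for pred, gt, w in zip(pred_flows, gt_flows, weights):
        epe = np.sqrt(((pred - gt) ** 2).sum(axis=0))  # per-pixel endpoint error
        loss += w * epe.mean()                         # mean == sum / (W_s * H_s)
    return loss

# A perfect prediction gives zero loss; a constant (3, 4) flow error gives
# a per-pixel endpoint error of 5 at every pixel and scale.
gt = [np.zeros((2, 4, 4)), np.zeros((2, 2, 2))]
pred = [g + np.array([3.0, 4.0]).reshape(2, 1, 1) for g in gt]
assert multiscale_epe(gt, gt, [0.5, 0.5]) == 0.0
assert np.isclose(multiscale_epe(pred, gt, [0.5, 0.5]), 5.0)
```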
3. Computing the optical flow map
Given two images of arbitrary size, input them into the trained adaptive correlation convolutional neural network to obtain the optical flow map between them. The steps, implemented with PyTorch, are as follows:
Step 1: given any two frames $I_1, I_2 \in \mathbb{R}^{3 \times H \times W}$ of a video, where $H$ and $W$ are the height and width of the images, perform the "convolution-activation" operation three times to obtain the feature maps out_conv3a and out_conv3b:
out_conv1a=conv1(I1)
out_conv2a=conv2(out_conv1a)
out_conv3a=conv3(out_conv2a)
out_conv1b=conv1(I2)
out_conv2b=conv2(out_conv1b)
out_conv3b=conv3(out_conv2b)
where conv1(·), conv2(·), and conv3(·) are "convolution-activation" operation functions.
Step 2: performing adaptive correlation operation on the feature maps out _ conv3a and out _ conv3b to obtain a feature map out _ correlation:
out_conv3a_=conv(out_conv3a)
out_conv3b_=conv(out_conv3b)
out_correlation=corr(out_conv3a_,out_conv3b_)
where conv(·) is the 1 × 1 convolution and corr(·) is the correlation operation function.
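A plain (unoptimized) NumPy sketch of what a correlation operation of this kind computes — each output channel is the per-pixel channel dot product between the first map and a shifted copy of the second, over a bounded displacement window. The window size d and the zero padding are assumptions for illustration; the patent does not state them:

```python
import numpy as np

def corr(a, b, d=2):
    """Correlate feature maps a, b of shape (C, H, W) over displacements
    in [-d, d]^2; returns ((2d+1)^2, H, W).  b is zero-padded so every
    displacement is defined at every position."""
    C, H, W = a.shape
    b_pad = np.pad(b, ((0, 0), (d, d), (d, d)))
    out = np.empty(((2 * d + 1) ** 2, H, W))
    k = 0
    for dy in range(2 * d + 1):
        for dx in range(2 * d + 1):
            # channel-wise dot product between a and the shifted copy of b
            out[k] = (a * b_pad[:, dy:dy + H, dx:dx + W]).sum(axis=0)
            k += 1
    return out

rng = np.random.default_rng(1)
a = rng.standard_normal((8, 6, 6))
v = corr(a, a, d=2)
assert v.shape == (25, 6, 6)
# the zero-displacement channel (index 12) is the per-pixel squared norm
assert np.allclose(v[12], (a ** 2).sum(axis=0))
```

In the adaptive correlation of step 2, this function is simply applied to the 1 × 1-convolved maps out_conv3a_ and out_conv3b_ instead of the raw features.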
Step 3: perform a "convolution-activation" operation on the feature map out_conv3a, then concatenate the result with the feature map out_correlation to obtain the concatenated feature map in_conv3_1:
out_conv_redir=conv_redir(out_conv3a)
in_conv3_1=cat(out_conv_redir,out_correlation)
where conv_redir(·) is a "convolution-activation" operation function and cat(·) is the channel concatenation operation function.
Step 4: perform a series of "convolution-activation" operations on the feature map in_conv3_1 to obtain the feature maps out_conv3, out_conv4, out_conv5, and out_conv6:
out_conv3=conv3_1(in_conv3_1)
out_conv4=conv4_1(conv4(out_conv3))
out_conv5=conv5_1(conv5(out_conv4))
out_conv6=conv6_1(conv6(out_conv5))
where conv3_1(·), conv4(·), conv5(·), conv6(·), conv4_1(·), conv5_1(·), and conv6_1(·) are "convolution-activation" operation functions.
Step 5: process the feature map out_conv6 as follows to obtain flow6, flow6_up, and out_deconv5:
flow6=predict_flow6(out_conv6)
flow6_up=crop_like(upsampled_flow6_to_5(flow6),out_conv5)
out_deconv5=crop_like(deconv5(out_conv6),out_conv5)
where predict_flow6(·) is a convolution operation, upsampled_flow6_to_5(·) and deconv5(·) are deconvolution ("up-convolution") operations, and crop_like(·) crops its first argument to the spatial size of the second.
Step 6: perform the following operations to obtain the feature maps concat5, flow5, flow5_up, and out_deconv4:
concat5=cat(out_conv5,out_deconv5,flow6_up)
flow5=predict_flow5(concat5)
flow5_up=crop_like(upsampled_flow5_to_4(flow5),out_conv4)
out_deconv4=crop_like(deconv4(concat5),out_conv4)
where predict_flow5(·) is a convolution operation, upsampled_flow5_to_4(·) and deconv4(·) are deconvolution operations, and crop_like(·) crops its first argument to the spatial size of the second.
Step 7: perform the following operations to obtain the feature maps concat4, flow4, flow4_up, and out_deconv3:
concat4=cat(out_conv4,out_deconv4,flow5_up)
flow4=predict_flow4(concat4)
flow4_up=crop_like(upsampled_flow4_to_3(flow4),out_conv3)
out_deconv3=crop_like(deconv3(concat4),out_conv3)
where predict_flow4(·) is a convolution operation, upsampled_flow4_to_3(·) and deconv3(·) are deconvolution operations, and crop_like(·) crops its first argument to the spatial size of the second.
Step 8: perform the following operations to obtain the feature maps concat3, flow3, flow3_up, and out_deconv2:
concat3=cat(out_conv3,out_deconv3,flow4_up)
flow3=predict_flow3(concat3)
flow3_up=crop_like(upsampled_flow3_to_2(flow3),out_conv2a)
out_deconv2=crop_like(deconv2(concat3),out_conv2a)
where predict_flow3(·) is a convolution operation, upsampled_flow3_to_2(·) and deconv2(·) are deconvolution operations, and crop_like(·) crops its first argument to the spatial size of the second.
Step 9: perform the following operations to obtain the final optical flow map flow2:
concat2=cat(out_conv2a,out_deconv2,flow3_up)
flow2=predict_flow2(concat2)
where predict_flow2(·) is a convolution operation.
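As a sanity check on steps 1–9, the spatial resolutions can be tracked with simple arithmetic, assuming the usual FlowNetC strides (each of conv1–conv6 halves the resolution and each upsampling step doubles it; these strides are assumptions, not stated in the text above):

```python
# Resolution bookkeeping for the pipeline above, under the assumed strides.
H, W = 384, 512                      # example input size (divisible by 64)

def down(hw, times):
    """Halve a (height, width) pair `times` times."""
    h, w = hw
    return (h >> times, w >> times)

enc = {name: down((H, W), i + 1)
       for i, name in enumerate(["out_conv1", "out_conv2", "out_conv3",
                                 "out_conv4", "out_conv5", "out_conv6"])}
assert enc["out_conv3"] == (48, 64)   # correlation is computed at 1/8 scale
assert enc["out_conv6"] == (6, 8)     # coarsest scale, where flow6 is predicted

# Steps 5-9 double the resolution four times: flow6 -> flow5 -> ... -> flow2,
# so the final map flow2 sits at 1/4 of the input resolution.
flow2 = down((H, W), 2)
assert flow2 == (96, 128)
```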
To verify the effect of the method of the invention, simulation experiments were carried out on the Sintel Clean, Sintel Final, and FlyingChairs data sets using Python and the PyTorch deep learning framework, on a machine with an Intel Core i7-6800K 3.40 GHz CPU, an NVIDIA GeForce GTX 1080 GPU, and the Ubuntu operating system. The optical flow computation method using the original FlowNetC model and the method of the invention were compared by measuring average endpoint error, running time, and model size; the results are shown in Table 1. In the table, Sintel Clean, Sintel Final, and FlyingChairs denote the three original data sets; the suffix "_L" denotes a data set with added brightness changes, and "_N" a data set with added noise. The results show that the method achieves higher accuracy with almost no increase in test time or model size, and that it performs better, i.e. is more robust, on the data sets with added brightness changes and noise. In summary, the method offers high accuracy, strong robustness, and good practicality.
TABLE 1 (provided as an image in the original publication: average endpoint error, running time, and model size for each method and data set)

Claims (2)

1. An optical flow computation method based on an adaptive correlation convolutional neural network, characterized by comprising the following steps:
Step 1: add 256 1 × 1 convolution kernels before the ordinary correlation operation in the original FlowNetC model to obtain the improved adaptive correlation convolutional network;
Step 2: input a data set and train the improved adaptive correlation convolutional neural network to obtain the trained adaptive correlation convolutional neural network;
Step 3: given two images of arbitrary size, input them into the trained adaptive correlation convolutional neural network to obtain the optical flow map between the two images.
2. The optical flow computation method based on an adaptive correlation convolutional neural network of claim 1, characterized in that in the network training described in step 2, the network loss function $L$ is:

$$L = \sum_{s=1}^{S} \frac{w_s}{W_s H_s} \sum_{x=1}^{W_s} \sum_{y=1}^{H_s} \left\| \hat{f}_s(x, y) - f_s(x, y) \right\|_2$$

where $s$ is the scale index, $S$ the total number of scales, $(x, y)$ a pixel coordinate in the image, $w_s$ the weight of scale $s$, $W_s$ and $H_s$ the width and height of the optical flow image at scale $s$, $\hat{f}_s(x, y)$ the estimated optical flow vector at pixel $(x, y)$, and $f_s(x, y)$ the true optical flow vector at pixel $(x, y)$; the initial learning rate is 0.0001 and is divided by 10 every 30 training epochs, for 100 training epochs in total; network parameters are optimized using a mini-batch adaptive gradient descent algorithm with the batch size set to 8.
CN201910980474.XA 2019-10-15 2019-10-15 Optical flow calculation method based on self-adaptive correlation convolution neural network Pending CN111369595A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910980474.XA CN111369595A (en) 2019-10-15 2019-10-15 Optical flow calculation method based on self-adaptive correlation convolution neural network


Publications (1)

Publication Number Publication Date
CN111369595A true CN111369595A (en) 2020-07-03

Family

ID=71210044

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910980474.XA Pending CN111369595A (en) 2019-10-15 2019-10-15 Optical flow calculation method based on self-adaptive correlation convolution neural network

Country Status (1)

Country Link
CN (1) CN111369595A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107967695A (en) * 2017-12-25 2018-04-27 北京航空航天大学 A kind of moving target detecting method based on depth light stream and morphological method
CN109711316A (en) * 2018-12-21 2019-05-03 广东工业大学 A kind of pedestrian recognition methods, device, equipment and storage medium again
CN110111366A (en) * 2019-05-06 2019-08-09 北京理工大学 A kind of end-to-end light stream estimation method based on multistage loss amount

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
ALEXEY DOSOVITSKIY et al.: "FlowNet: Learning Optical Flow with Convolutional Networks", 2015 IEEE International Conference on Computer Vision (ICCV) *
ANURAG RANJAN et al.: "Optical Flow Estimation Using a Spatial Pyramid Network", 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) *
ZHOU Wenjun et al.: "Fast Human Pose Estimation Based on Optical Flow", Computer Systems & Applications *
WANG Song: "Research on Occlusion-Robust Optical Flow Field Estimation Algorithms", China Doctoral Dissertations Full-text Database, Information Science and Technology *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112634324A (en) * 2020-12-07 2021-04-09 中国地质大学(武汉) Optical flow field estimation method based on deep convolutional neural network
CN114005075A (en) * 2021-12-30 2022-02-01 深圳佑驾创新科技有限公司 Construction method and device of optical flow estimation model and optical flow estimation method
CN114005075B (en) * 2021-12-30 2022-04-05 深圳佑驾创新科技有限公司 Construction method and device of optical flow estimation model and optical flow estimation method


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200703