CN111369595A - Optical flow calculation method based on self-adaptive correlation convolution neural network - Google Patents

Optical flow calculation method based on self-adaptive correlation convolution neural network

Info

Publication number
CN111369595A
CN111369595A
Authority
CN
China
Prior art keywords
optical flow
neural network
adaptive
adaptive correlation
convolution neural
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910980474.XA
Other languages
Chinese (zh)
Inventor
Yuan Yuan (袁媛)
Li Haopeng (李昊鹏)
Wang Qi (王琦)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northwestern Polytechnical University
Original Assignee
Northwestern Polytechnical University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northwestern Polytechnical University
Priority to CN201910980474.XA
Publication of CN111369595A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/269Analysis of motion using gradient-based methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks


Abstract

The invention provides an optical flow computation method based on an adaptive correlation convolutional neural network. For an image pair of arbitrary size, an improved adaptive correlation convolutional neural network performs pixel matching and computes the optical flow map. A 1 × 1 convolution is added before the ordinary correlation operation of the FlowNetC model, mining the correlation between different feature dimensions and integrating metric learning into the deep neural network. This improves the accuracy and robustness of optical flow computation without increasing computation time or memory consumption.

Description

Optical flow calculation method based on self-adaptive correlation convolution neural network
Technical Field
The invention belongs to the technical field of computer vision and video feature extraction, and particularly relates to an optical flow computation method based on an adaptive correlation convolutional neural network. The invention can be applied to video motion information extraction, action recognition, and the like.
Background
Optical flow computation seeks the pixel correspondences between two images. Because it conveys all of the motion information contained in the images, optical flow is applied in many fields, such as action recognition, video frame interpolation, object tracking, and video segmentation. However, accurate optical flow estimation remains challenging due to motion blur, occlusion, illumination change, and large displacements.
Existing learning-based optical flow computation methods include those using conventional machine learning techniques, e.g. Markov random field models, statistical models, stochastic optimization, and principal component analysis, and those using deep learning. In recent years, convolutional neural networks have become the state of the art for optical flow computation due to their strong fitting and representation capability and their end-to-end learning manner. Compared with traditional methods, convolutional neural networks greatly improve the accuracy and efficiency of optical flow estimation. The core of computing optical flow with a convolutional neural network is finding the pixel matches between the two images. To this end, the paper "Alexey Dosovitskiy, Philipp Fischer, Eddy Ilg, Philip Häusser, Caner Hazırbaş, Vladimir Golkov, Patrick van der Smagt, Daniel Cremers, and Thomas Brox, FlowNet: Learning optical flow with convolutional networks, ICCV, 2015" proposes FlowNetC, in which a correlation layer compares image patches of the two images. Like convolution, normalization, and pooling layers, it is fully differentiable and can be embedded in any neural network. For a neural network that computes optical flow, the correlation layer is essential. However, the correlation layer has two main limitations: 1) it only considers the correspondence of each feature dimension with itself, ignoring the dependencies between dimensions; 2) the correspondences are weighted equally, ignoring the differences between dimensions.
Disclosure of Invention
To overcome the shortcomings of the prior art, the invention provides an optical flow computation method based on an adaptive correlation convolutional neural network. For an image pair of arbitrary size, the invention performs pixel matching with an adaptive correlation layer to obtain the final optical flow map. Compared with FlowNetC, which contains an ordinary correlation layer, the method improves the accuracy and robustness of optical flow computation without increasing computation time or memory consumption.
An optical flow computation method based on an adaptive correlation convolutional neural network, comprising the following steps:
Step 1: add 256 1 × 1 convolution kernels before the ordinary correlation operation in the original FlowNetC model to obtain the improved adaptive correlation convolutional network.
Step 2: input a data set and train the improved adaptive correlation convolutional neural network to obtain the trained adaptive correlation convolutional neural network.
Step 3: given two images of arbitrary size, input them into the trained adaptive correlation convolutional neural network to obtain the optical flow map between the two images.
In the network training described in step 2, the network loss function $L$ is:

$$L = \sum_{s=1}^{S} \frac{w_s}{W_s H_s} \sum_{x=1}^{W_s} \sum_{y=1}^{H_s} \left\| \hat{f}_s(x, y) - f_s(x, y) \right\|_2$$

where $s$ is the scale index, $S$ the total number of scales, $(x, y)$ a pixel coordinate in the image, $w_s$ the weight of scale $s$, $W_s$ and $H_s$ the width and height of the optical flow image at scale $s$, $\hat{f}_s(x, y)$ the estimated optical flow vector at pixel $(x, y)$, and $f_s(x, y)$ the true optical flow vector at pixel $(x, y)$.
the initial learning rate in the training process is 0.0001, the learning rate of each 30 training rounds is reduced by 10 times, and 100 training rounds are trained in total. Network parameters were optimized using a batch adaptive gradient descent algorithm with the batch size set to 8.
The advantage of the method is that it improves the existing FlowNetC model by adding a 1 × 1 convolution before the ordinary correlation operation. This mines the correlation between different feature dimensions and integrates metric learning into the deep neural network, addressing the problems of the FlowNetC algorithm. Using the improved adaptive correlation convolutional neural network for optical flow computation improves accuracy and robustness without increasing computation time or memory consumption.
Drawings
FIG. 1 is a schematic diagram of an adaptive correlation convolutional neural network of the present invention.
Detailed Description
The present invention is further described below with reference to the drawings and an embodiment; the invention includes, but is not limited to, this embodiment.
The invention provides an optical flow computation method based on an adaptive correlation convolutional neural network, implemented as follows:
1. Improved FlowNetC model
The FlowNetC model described in "Alexey Dosovitskiy, Philipp Fischer, Eddy Ilg, Philip Häusser, Caner Hazırbaş, Vladimir Golkov, Patrick van der Smagt, Daniel Cremers, and Thomas Brox, FlowNet: Learning optical flow with convolutional networks, ICCV, 2015" contains a correlation layer that compares the similarity of each position in two feature maps, enabling accurate end-to-end estimation of optical flow maps. However, it ignores the coupling between the dimensions of the feature maps, so the resulting optical flow lacks robustness.
To overcome this problem, a learnable linear mapping can be used to mine the correlation between different feature dimensions, generalizing the ordinary Euclidean distance to a Mahalanobis distance and thereby integrating metric learning into the deep neural network. Concretely, a 1 × 1 convolution with 256 kernels is added before the ordinary correlation operation of the model.
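The link between the 1 × 1 convolution and metric learning can be illustrated with a small NumPy sketch (illustrative only — the channel counts and names are made up, and this is not the patent's PyTorch code): projecting both feature vectors with a shared linear map M turns the plain dot-product correlation into the bilinear form f1ᵀMᵀMf2, i.e. a learned Mahalanobis-style metric.

```python
import numpy as np

rng = np.random.default_rng(0)
C_in, C_out = 8, 16                      # illustrative channel counts
M = rng.standard_normal((C_out, C_in))   # weights of the shared 1x1 convolution

f1 = rng.standard_normal(C_in)           # feature vector of a pixel in image 1
f2 = rng.standard_normal(C_in)           # feature vector of a pixel in image 2

# Ordinary correlation: a dot product that treats all channels independently
# and weights them equally (the two limitations noted in the background).
plain = f1 @ f2

# Adaptive correlation: project both features with the shared 1x1 convolution
# before taking the dot product.
adaptive = (M @ f1) @ (M @ f2)

# The same value written as a bilinear form f1^T (M^T M) f2: the learned
# matrix M^T M plays the role a covariance matrix plays in a Mahalanobis
# distance, which is how metric learning enters the network.
bilinear = f1 @ (M.T @ M) @ f2
assert np.isclose(adaptive, bilinear)
```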
2. Network training
The FlowNetC model improved in step 1 is trained on public data sets. This embodiment uses the Sintel data set proposed by Butler et al. in "Daniel J. Butler, Jonas Wulff, Garrett B. Stanley, and Michael J. Black, A naturalistic open source movie for optical flow evaluation, ECCV, 2012" and the FlyingChairs data set proposed by Dosovitskiy et al. in "Alexey Dosovitskiy, Philipp Fischer, Eddy Ilg, Philip Häusser, Caner Hazırbaş, Vladimir Golkov, Patrick van der Smagt, Daniel Cremers, and Thomas Brox, FlowNet: Learning optical flow with convolutional networks, ICCV, 2015", each divided into training and test sets following the respective papers. To verify the robustness of the method of the invention, brightness changes and random noise are additionally applied to the images.
The training process employs the following multi-scale endpoint error loss function L:
$$L = \sum_{s=1}^{S} \frac{w_s}{W_s H_s} \sum_{x=1}^{W_s} \sum_{y=1}^{H_s} \left\| \hat{f}_s(x, y) - f_s(x, y) \right\|_2$$

where $s$ is the scale index, $S$ the total number of scales, $(x, y)$ a pixel coordinate in the image, $w_s$ the weight of scale $s$, $W_s$ and $H_s$ the width and height of the optical flow image at scale $s$, $\hat{f}_s(x, y)$ the estimated optical flow vector at pixel $(x, y)$, and $f_s(x, y)$ the true optical flow vector at pixel $(x, y)$. The initial learning rate is 0.0001 and is divided by 10 every 30 training epochs, for 100 training epochs in total. Network parameters are optimized using a mini-batch adaptive gradient descent algorithm with the batch size set to 8.
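The loss above can be sketched in NumPy as follows (an illustrative sketch under the assumption that each flow is stored as a (2, H_s, W_s) array; this is not the patent's PyTorch training code):

```python
import numpy as np

def multiscale_epe(pred_flows, gt_flows, weights):
    """Multi-scale endpoint-error loss L: for each scale s, the mean over
    all W_s x H_s pixels of the Euclidean norm of the difference between
    the estimated and true flow vectors, weighted by w_s and summed over
    scales.  Each flow is an array of shape (2, H_s, W_s)."""
    loss = 0.0
    for pred, gt, w in zip(pred_flows, gt_flows, weights):
        epe = np.sqrt(((pred - gt) ** 2).sum(axis=0))  # per-pixel endpoint error
        loss += w * epe.mean()                         # mean == sum / (W_s * H_s)
    return loss

# A perfect prediction gives zero loss; a constant (3, 4) flow error gives
# a per-pixel endpoint error of 5 at every pixel and scale.
gt = [np.zeros((2, 4, 4)), np.zeros((2, 2, 2))]
pred = [g + np.array([3.0, 4.0]).reshape(2, 1, 1) for g in gt]
assert multiscale_epe(gt, gt, [0.5, 0.5]) == 0.0
assert np.isclose(multiscale_epe(pred, gt, [0.5, 0.5]), 5.0)
```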
3. Computing the optical flow map
Given two images of arbitrary size, input them into the trained adaptive correlation convolutional neural network to obtain the optical flow map between them. The steps, implemented with PyTorch, are as follows:
Step 1: given any two frames $I_1, I_2 \in \mathbb{R}^{3 \times H \times W}$ of a video, where $H$ and $W$ are the height and width of the images, perform the "convolution-activation" operation three times to obtain the feature maps out_conv3a and out_conv3b:
out_conv1a=conv1(I1)
out_conv2a=conv2(out_conv1a)
out_conv3a=conv3(out_conv2a)
out_conv1b=conv1(I2)
out_conv2b=conv2(out_conv1b)
out_conv3b=conv3(out_conv2b)
where conv1(·), conv2(·), and conv3(·) are "convolution-activation" operation functions.
Step 2: performing adaptive correlation operation on the feature maps out _ conv3a and out _ conv3b to obtain a feature map out _ correlation:
out_conv3a_=conv(out_conv3a)
out_conv3b_=conv(out_conv3b)
out_correlation=corr(out_conv3a_,out_conv3b_)
where conv(·) is the 1 × 1 convolution and corr(·) is the correlation operation function.
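A plain (unoptimized) NumPy sketch of what a correlation operation of this kind computes — each output channel is the per-pixel channel dot product between the first map and a shifted copy of the second, over a bounded displacement window. The window size d and the zero padding are assumptions for illustration; the patent does not state them:

```python
import numpy as np

def corr(a, b, d=2):
    """Correlate feature maps a, b of shape (C, H, W) over displacements
    in [-d, d]^2; returns ((2d+1)^2, H, W).  b is zero-padded so every
    displacement is defined at every position."""
    C, H, W = a.shape
    b_pad = np.pad(b, ((0, 0), (d, d), (d, d)))
    out = np.empty(((2 * d + 1) ** 2, H, W))
    k = 0
    for dy in range(2 * d + 1):
        for dx in range(2 * d + 1):
            # channel-wise dot product between a and the shifted copy of b
            out[k] = (a * b_pad[:, dy:dy + H, dx:dx + W]).sum(axis=0)
            k += 1
    return out

rng = np.random.default_rng(1)
a = rng.standard_normal((8, 6, 6))
v = corr(a, a, d=2)
assert v.shape == (25, 6, 6)
# the zero-displacement channel (index 12) is the per-pixel squared norm
assert np.allclose(v[12], (a ** 2).sum(axis=0))
```

In the adaptive correlation of step 2, this function is simply applied to the 1 × 1-convolved maps out_conv3a_ and out_conv3b_ instead of the raw features.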
Step 3: perform a "convolution-activation" operation on the feature map out_conv3a, then concatenate the result with the feature map out_correlation to obtain the concatenated feature map in_conv3_1:
out_conv_redir=conv_redir(out_conv3a)
in_conv3_1=cat(out_conv_redir,out_correlation)
where conv_redir(·) is a "convolution-activation" operation function and cat(·) is the channel concatenation operation function.
Step 4: perform a series of "convolution-activation" operations on the feature map in_conv3_1 to obtain the feature maps out_conv3, out_conv4, out_conv5, and out_conv6:
out_conv3=conv3_1(in_conv3_1)
out_conv4=conv4_1(conv4(out_conv3))
out_conv5=conv5_1(conv5(out_conv4))
out_conv6=conv6_1(conv6(out_conv5))
where conv3_1(·), conv4(·), conv5(·), conv6(·), conv4_1(·), conv5_1(·), and conv6_1(·) are "convolution-activation" operation functions.
Step 5: process the feature map out_conv6 as follows to obtain flow6, flow6_up, and out_deconv5:
flow6=predict_flow6(out_conv6)
flow6_up=crop_like(upsampled_flow6_to_5(flow6),out_conv5)
out_deconv5=crop_like(deconv5(out_conv6),out_conv5)
where predict_flow6(·) is a convolution operation, upsampled_flow6_to_5(·) and deconv5(·) are deconvolution ("up-convolution") operations, and crop_like(·) crops its first argument to the spatial size of the second.
Step 6: perform the following operations to obtain the feature maps concat5, flow5, flow5_up, and out_deconv4:
concat5=cat(out_conv5,out_deconv5,flow6_up)
flow5=predict_flow5(concat5)
flow5_up=crop_like(upsampled_flow5_to_4(flow5),out_conv4)
out_deconv4=crop_like(deconv4(concat5),out_conv4)
where predict_flow5(·) is a convolution operation, upsampled_flow5_to_4(·) and deconv4(·) are deconvolution operations, and crop_like(·) crops its first argument to the spatial size of the second.
Step 7: perform the following operations to obtain the feature maps concat4, flow4, flow4_up, and out_deconv3:
concat4=cat(out_conv4,out_deconv4,flow5_up)
flow4=predict_flow4(concat4)
flow4_up=crop_like(upsampled_flow4_to_3(flow4),out_conv3)
out_deconv3=crop_like(deconv3(concat4),out_conv3)
where predict_flow4(·) is a convolution operation, upsampled_flow4_to_3(·) and deconv3(·) are deconvolution operations, and crop_like(·) crops its first argument to the spatial size of the second.
Step 8: perform the following operations to obtain the feature maps concat3, flow3, flow3_up, and out_deconv2:
concat3=cat(out_conv3,out_deconv3,flow4_up)
flow3=predict_flow3(concat3)
flow3_up=crop_like(upsampled_flow3_to_2(flow3),out_conv2a)
out_deconv2=crop_like(deconv2(concat3),out_conv2a)
where predict_flow3(·) is a convolution operation, upsampled_flow3_to_2(·) and deconv2(·) are deconvolution operations, and crop_like(·) crops its first argument to the spatial size of the second.
Step 9: perform the following operations to obtain the final optical flow map flow2:
concat2=cat(out_conv2a,out_deconv2,flow3_up)
flow2=predict_flow2(concat2)
where predict_flow2(·) is a convolution operation.
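As a sanity check on steps 1–9, the spatial resolutions can be tracked with simple arithmetic, assuming the usual FlowNetC strides (each of conv1–conv6 halves the resolution and each upsampling step doubles it; these strides are assumptions, not stated in the text above):

```python
# Resolution bookkeeping for the pipeline above, under the assumed strides.
H, W = 384, 512                      # example input size (divisible by 64)

def down(hw, times):
    """Halve a (height, width) pair `times` times."""
    h, w = hw
    return (h >> times, w >> times)

enc = {name: down((H, W), i + 1)
       for i, name in enumerate(["out_conv1", "out_conv2", "out_conv3",
                                 "out_conv4", "out_conv5", "out_conv6"])}
assert enc["out_conv3"] == (48, 64)   # correlation is computed at 1/8 scale
assert enc["out_conv6"] == (6, 8)     # coarsest scale, where flow6 is predicted

# Steps 5-9 double the resolution four times: flow6 -> flow5 -> ... -> flow2,
# so the final map flow2 sits at 1/4 of the input resolution.
flow2 = down((H, W), 2)
assert flow2 == (96, 128)
```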
To verify the effect of the method of the invention, simulation experiments were carried out on the Sintel Clean, Sintel Final, and FlyingChairs data sets using Python and the PyTorch deep learning framework, on a machine with an Intel Core i7-6800K 3.40 GHz CPU, an NVIDIA GeForce GTX 1080 GPU, and the Ubuntu operating system. The optical flow computation method using the original FlowNetC model and the method of the invention were compared by measuring average endpoint error, running time, and model size; the results are shown in Table 1. In the table, Sintel Clean, Sintel Final, and FlyingChairs denote the three original data sets; the suffix "_L" denotes a data set with added brightness changes, and "_N" a data set with added noise. The results show that the method achieves higher accuracy with almost no increase in test time or model size, and that it performs better, i.e. is more robust, on the data sets with added brightness changes and noise. In summary, the method offers high accuracy, strong robustness, and good practicality.
TABLE 1 (provided as an image in the original publication: average endpoint error, running time, and model size for each method and data set)

Claims (2)

1. An optical flow computation method based on an adaptive correlation convolutional neural network, characterized by comprising the following steps:
Step 1: add 256 1 × 1 convolution kernels before the ordinary correlation operation in the original FlowNetC model to obtain the improved adaptive correlation convolutional network;
Step 2: input a data set and train the improved adaptive correlation convolutional neural network to obtain the trained adaptive correlation convolutional neural network;
Step 3: given two images of arbitrary size, input them into the trained adaptive correlation convolutional neural network to obtain the optical flow map between the two images.
2. The optical flow computation method based on an adaptive correlation convolutional neural network of claim 1, characterized in that in the network training described in step 2, the network loss function $L$ is:

$$L = \sum_{s=1}^{S} \frac{w_s}{W_s H_s} \sum_{x=1}^{W_s} \sum_{y=1}^{H_s} \left\| \hat{f}_s(x, y) - f_s(x, y) \right\|_2$$

where $s$ is the scale index, $S$ the total number of scales, $(x, y)$ a pixel coordinate in the image, $w_s$ the weight of scale $s$, $W_s$ and $H_s$ the width and height of the optical flow image at scale $s$, $\hat{f}_s(x, y)$ the estimated optical flow vector at pixel $(x, y)$, and $f_s(x, y)$ the true optical flow vector at pixel $(x, y)$; the initial learning rate is 0.0001 and is divided by 10 every 30 training epochs, for 100 training epochs in total; network parameters are optimized using a mini-batch adaptive gradient descent algorithm with the batch size set to 8.
CN201910980474.XA 2019-10-15 2019-10-15 Optical flow calculation method based on self-adaptive correlation convolution neural network Pending CN111369595A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910980474.XA CN111369595A (en) 2019-10-15 2019-10-15 Optical flow calculation method based on self-adaptive correlation convolution neural network


Publications (1)

Publication Number Publication Date
CN111369595A true CN111369595A (en) 2020-07-03

Family

ID=71210044

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910980474.XA Pending CN111369595A (en) 2019-10-15 2019-10-15 Optical flow calculation method based on self-adaptive correlation convolution neural network

Country Status (1)

Country Link
CN (1) CN111369595A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107967695A (en) * 2017-12-25 2018-04-27 北京航空航天大学 A kind of moving target detecting method based on depth light stream and morphological method
CN109711316A (en) * 2018-12-21 2019-05-03 广东工业大学 A kind of pedestrian recognition methods, device, equipment and storage medium again
CN110111366A (en) * 2019-05-06 2019-08-09 北京理工大学 A kind of end-to-end light stream estimation method based on multistage loss amount

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
ALEXEY DOSOVITSKIY et al.: "FlowNet: Learning Optical Flow with Convolutional Networks", 2015 IEEE International Conference on Computer Vision (ICCV) *
ANURAG RANJAN et al.: "Optical Flow Estimation Using a Spatial Pyramid Network", 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) *
ZHOU Wenjun et al.: "Fast Human Pose Estimation Based on Optical Flow", Computer Systems & Applications *
WANG Song: "Research on Occlusion-Robust Optical Flow Field Estimation Algorithms", China Doctoral Dissertations Full-text Database, Information Science and Technology *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112634324A (en) * 2020-12-07 2021-04-09 中国地质大学(武汉) Optical flow field estimation method based on deep convolutional neural network
CN114005075A (en) * 2021-12-30 2022-02-01 深圳佑驾创新科技有限公司 Construction method and device of optical flow estimation model and optical flow estimation method
CN114005075B (en) * 2021-12-30 2022-04-05 深圳佑驾创新科技有限公司 Construction method and device of optical flow estimation model and optical flow estimation method


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200703