CN113963052B - Large aerostat volume real-time monitoring method based on binocular vision - Google Patents


Info

Publication number
CN113963052B
CN113963052B (application CN202111108084.7A)
Authority
CN
China
Prior art keywords
image
parallax
binocular
point cloud
point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111108084.7A
Other languages
Chinese (zh)
Other versions
CN113963052A (en)
Inventor
张涛
杨朝旭
荣海军
陶思宇
王瑞
刘泽华
刘馨媛
张少杰
Current Assignee
Xian Jiaotong University
Original Assignee
Xian Jiaotong University
Priority date
Filing date
Publication date
Application filed by Xian Jiaotong University
Priority to CN202111108084.7A
Publication of CN113963052A
Application granted
Publication of CN113963052B
Legal status: Active

Classifications

    • G06T7/62 Analysis of geometric attributes of area, perimeter, diameter or volume
    • G06T5/70 Denoising; Smoothing
    • G06T5/80 Geometric correction
    • G06T7/11 Region-based segmentation
    • G06T7/136 Segmentation; Edge detection involving thresholding
    • G06T7/80 Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G06T2207/10028 Range image; Depth image; 3D point clouds


Abstract

A binocular-vision-based method for real-time monitoring of the volume of a large aerostat comprises the following steps: acquiring binocular images and segmenting them with a mean-shift-based binocular difference image segmentation algorithm; segmenting the overlapping region of the binocular images by taking the gray information of corresponding parallax interior points within a specific window as the reference, and removing non-corresponding pixels to obtain the target region; reconstructing the target region in three dimensions with a stereo matching algorithm based on adaptive-threshold epipolar distance transformation to obtain a disparity map; obtaining the depth corresponding to each pixel of the disparity map and generating a point cloud through three-dimensional reconstruction; removing noise from the generated point cloud and reducing its data volume by downsampling; stitching the point clouds based on the physical position relationship among the cameras; fitting the point cloud with quadratic function interpolation to obtain a complete point cloud; and solving the volume of the complete point cloud with a slicing method. The binocular vision reconstruction achieves both high accuracy and high computational efficiency.

Description

Large aerostat volume real-time monitoring method based on binocular vision
Technical Field
The invention belongs to the technical field of non-contact volume measurement, and particularly relates to a binocular vision-based real-time monitoring method for the volume of a large aerostat.
Background
The aerostat is a novel multi-purpose flight platform for near space; with the development of aviation technology it has come to combine air surveillance, communication relay and other uses, with great significance in both military and civil applications, so it has become a hotspot of current international research on near-space aircraft technology and military applications. The non-rigid (soft) aerostat is the most mature and most widely used type in the world: its shape is defined mainly by its envelope, which is maintained by the difference between the internal and external pressure of the envelope, and its lift comes mainly from the buoyancy produced by the density difference between the lifting gas inside the envelope and the surrounding air. Measuring the volume of the aerostat envelope makes it possible to detect the airtightness of the envelope, to obtain the net buoyancy of the aerostat, and to monitor the envelope during inflation, which has strong practical significance for flight performance and the corresponding flight and floating parameters.
Currently, computing the volume of an irregular target generally requires three-dimensional reconstruction of the target, after which the target volume is obtained from its point cloud. Binocular vision has become an important approach in three-dimensional reconstruction because of its low hardware requirements and convenient deployment. However, when facing a distant and very large target, the target's huge size, its color similarity to the background, and the lack of distinctive texture in local regions of the target degrade the performance of existing algorithms and significantly increase the computational load, so research on real-time monitoring of large aerostat volume has important theoretical significance and application value.
Disclosure of Invention
The invention aims to solve the problems in the prior art, and provides a binocular vision-based real-time monitoring method for the volume of a large aerostat, which effectively improves the precision and the calculation efficiency of binocular vision when the binocular vision is applied to the large aerostat.
In order to achieve the above purpose, the present invention has the following technical scheme:
a binocular vision-based real-time monitoring method for the volume of a large aerostat comprises the following steps:
acquiring binocular images, and segmenting them with a mean-shift-based binocular difference image segmentation algorithm;
segmenting the overlapping region of the binocular images by taking the gray information of corresponding parallax interior points within a specific window as the reference, and removing non-corresponding pixels from the binocular images to obtain the target region;
reconstructing the target region in three dimensions with a stereo matching algorithm based on adaptive-threshold epipolar distance transformation to obtain a disparity map;
obtaining the depth information corresponding to each pixel of the disparity map, and generating a point cloud through three-dimensional reconstruction;
removing noise from the point cloud generated by the three-dimensional reconstruction, and reducing the point cloud data volume through downsampling;
stitching the point clouds based on the physical position relationship among the cameras;
fitting the point cloud based on quadratic function interpolation to obtain a complete point cloud;
and solving the volume of the complete point cloud using a slicing method.
As a preferred scheme of the invention, the binocular image acquisition is completed by a real-time monitoring system comprising a 5K image real-time acquisition and transmission subsystem and a volume calculation and visualization subsystem; the volume calculation and visualization subsystem receives the signals of the 5K image real-time acquisition and transmission subsystem, processes them, and outputs the three-dimensional point cloud information and the volume calculation result of the aerostat. A binocular camera module is arranged in the 5K image real-time acquisition and transmission subsystem; its focal length is first adjusted so that it acquires clear image data of the aerostat, then the intrinsic and extrinsic parameters and distortion coefficients of the binocular camera module are calibrated with Zhang's calibration method, and the calibration result is transmitted to the volume calculation and visualization subsystem.
As a preferred scheme of the invention, the 5K image real-time acquisition and transmission subsystem consists of eight sets of 5K image real-time acquisition devices, which are arranged around the target aerostat according to their lens parameters, the binocular baseline distance and the size of the target aerostat, so that the image information of the target aerostat can be acquired completely.
As a preferred embodiment of the present invention, the MATLAB toolbox Stereo Camera Calibrator is used to rectify the acquired images of each view angle when acquiring the binocular images.
As a preferred scheme of the invention, segmenting the binocular images with the mean-shift-based binocular difference image segmentation algorithm specifically comprises the following steps:
taking the segmentation of the left image of the binocular pair as an example;
after the parallax range is obtained, the right image is shifted to the right over the parallax range:

R_d(x, y) = R(x - d, y)

where R(x, y) is the right image before translation and R_d(x, y) is the right image after translation by d;
differencing the translated right image for each parallax with the left image, and thresholding each difference image:

D_d(x, y) = |L(x, y) - R_d(x, y)|

M_d(x, y) = 1 if D_d(x, y) < T_d, and 0 otherwise

where D_d(x, y) is the difference image corresponding to parallax d; L(x, y) is the left image; M_d(x, y) is the binary image obtained after thresholding; and T_d is the difference threshold;
superimposing the binary images corresponding to each parallax, and segmenting the target region:

S(x, y) = Σ_{d = d_min}^{d_max} M_d(x, y), with the target region taken where S(x, y) > T

where S(x, y) is the superposition result of the difference images; d_min and d_max are the minimum and maximum parallaxes; and T is the threshold for segmenting the superimposed image;
the right image is differentiated from the right-shifted left image to partition the target region.
As a preferred scheme of the present invention, the specific steps for segmenting the overlapping region of the binocular images are as follows: for any epipolar line in the binocular images, select the leftmost pixel of the right image and record the gray values of the pixels in its right neighborhood; find the candidate corresponding pixels within the parallax range and record the gray values in their neighborhoods; compare the gray values of the two neighborhoods, using the Euclidean distance to express their gray similarity:

S_l(d) = sqrt( Σ_{i=0}^{M-1} [ I_l(d + i, y) - I_r(i, y) ]² )

where I_l(x, y) is the gray value of the left image at point (x, y); I_r(x, y) is the gray value of the right image at point (x, y); d is the parallax; S_l(d) is the similarity of the regions corresponding to parallax d; and M is the length of the right neighborhood;
after the Euclidean distance for every parallax value is obtained, the minimum Euclidean distance and its corresponding parallax value are found by comparison;
removing the region of the left image to the left of the pixel at that parallax;
performing the above operation on every epipolar line of the binocular images removes the non-overlapping region of the left image;
similarly, the pixels on the right side of the left image and the pixels of the right image within the corresponding parallax range are selected as corresponding points; the gray values of the left-neighborhood pixels of the corresponding points are compared to obtain the Euclidean distance for each parallax value; the parallax with the minimum Euclidean distance is found, and the region to the right of that pixel in the right image is removed; when removing the right-side non-overlap, the Euclidean distance is computed as:

S_r(d) = sqrt( Σ_{i=0}^{M-1} [ I_l(W - 1 - i, y) - I_r(W - 1 - d - i, y) ]² )

where W is the image width.
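The left-edge overlap search described above can be sketched for a single epipolar line as follows; this is a sketch, and the window length m is an assumed parameter.

```python
import numpy as np

def best_left_disparity(left_row, right_row, d_min, d_max, m=5):
    """Compare the m-pixel right neighbourhood of the right image's
    leftmost pixel against each candidate corresponding pixel of the
    left image, and return the parallax with the smallest Euclidean
    distance S_l(d)."""
    ref = right_row[:m].astype(np.float64)
    best_d, best_s = d_min, np.inf
    for d in range(d_min, d_max + 1):
        cand = left_row[d : d + m].astype(np.float64)
        s = np.sqrt(np.sum((cand - ref) ** 2))   # S_l(d)
        if s < best_s:
            best_d, best_s = d, s
    return best_d
```

The region of the left image to the left of column `best_d` would then be discarded as non-overlapping.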
As a preferred scheme of the invention, the specific steps of the stereo matching algorithm based on adaptive-threshold epipolar distance transformation for three-dimensional reconstruction of the target region are as follows:
when three-dimensional reconstruction is performed with an epipolar distance transformation stereo matching algorithm, different thresholds are selected for different regions of the image; the distance transformation threshold for each pixel is obtained from the local information richness of the image;
the image local quality is represented using a linear combination of the image gray gradient and the image local variance:
G(x,y)=M(x,y)+V(x,y)
wherein: g (x, y) is the local quality of the image at pixel points (x, y) in the image; m (x, y) is the gray gradient at the pixel point (x, y); v (x, y) is the local variance at pixel point (x, y);
the image gray gradient measures the rate of change of gray values between adjacent pixels, and is solved as:

M(x, y) = sqrt( g_x² + g_y² )

where g_x and g_y are the gradients of the image in the x and y directions;
the gradients are computed with the Sobel operator:

g_x = I(x+1, y-1) - I(x-1, y-1) + 2·I(x+1, y) - 2·I(x-1, y) + I(x+1, y+1) - I(x-1, y+1)
g_y = I(x-1, y+1) - I(x-1, y-1) + 2·I(x, y+1) - 2·I(x, y-1) + I(x+1, y+1) - I(x+1, y-1)

where I(x, y) is the gray value at pixel (x, y) in the image;
the local variance measures the richness of gray information in the image, and is solved as:

V(x, y) = (1/L) Σ_{p=1}^{L} ( I(p) - Ī )²

where L is the number of pixels in the window; I(p) is the gray value of the p-th pixel; and Ī is the mean gray value of all pixels in the window;
adjusting contrast using gamma transformation;
the gamma transformation is divided into three steps of normalization, pre-compensation and inverse normalization;
the normalization is calculated as:

I′(x, y) = ( G(x, y) - G_l ) / ( G_m - G_l )

where G(x, y) is the region information at coordinate (x, y) before normalization; I′(x, y) is the region information at coordinate (x, y) after normalization; and G_m and G_l are the maximum and minimum values of the local image information;
since the minimum of the corresponding local information is 0, the equation reduces to:

I′(x, y) = G(x, y) / G_m
pre-compensation is carried out after normalization:

I″(x, y) = I′(x, y)^γ

where I″(x, y) and I′(x, y) are the region information at pixel (x, y) after and before pre-compensation, and γ is the exponent used in pre-compensation to adjust contrast;
pre-compensation is followed by inverse normalization:

I‴(x, y) = I″(x, y) · G_m

where I‴(x, y) is the region information of the image at coordinate (x, y) after the gamma transformation;
γ is set to 0.3;
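Since the local minimum is zero, the three gamma steps collapse to a single power law scaled by the local maximum; a sketch with the patent's γ = 0.3 as the default (the function name is an assumption):

```python
import numpy as np

def gamma_transform(g, gamma=0.3):
    """Normalise -> pre-compensate (power law) -> inverse-normalise."""
    g = g.astype(np.float64)
    g_max = g.max()
    if g_max == 0:
        return g                # nothing to stretch
    norm = g / g_max            # I'(x, y) = G(x, y) / G_m
    pre = norm ** gamma         # I''(x, y) = I'(x, y) ** gamma
    return pre * g_max          # I'''(x, y) = I''(x, y) * G_m
```

With γ < 1 this lifts low values relative to high ones, boosting the contrast of low-information regions.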
after the contrast is adjusted by the gamma transformation, the richness of each region is obtained by normalization; with the richness of the different regions known, the epipolar distance transformation threshold a(x, y) for each pixel in the image is derived with the maximum parallax value as the reference;
after the corresponding threshold value is obtained, performing distance transformation;
for all pixels on each epipolar line, define x as the order of the pixels with non-zero gray value on that line, with gray value I(x) and image abscissa c(x), and judge whether the following relations hold:

c(x) - c(x-1) > T1
I(x) - I(x-1) > T2

where T1 and T2 are the coordinate threshold and the gray threshold respectively;
if the relations hold, the x-th non-zero pixel on the epipolar line is defined as a zero point, and the region between adjacent zero points is called an original support domain. After the original support domains are obtained, the distance transformation is performed: for each pixel, points are searched leftward and rightward within its original support domain whose gray difference from the pixel exceeds a(x, y):

‖I(x_l, y) - I(x, y)‖ > a(x, y)
‖I(x, y) - I(x_r, y)‖ > a(x, y)

where I(x, y) is the gray value of the point with coordinate (x, y), and a(x, y) is the distance transformation threshold of each pixel;
if the gray difference of a pixel (x_l, y) exceeds a(x, y), that point is taken as the left end point x_l; likewise the right end point is denoted x_r; if no such point is found within the original support domain, the end points of the original support domain are taken as the left and right end points, and the segmentation coefficient of the pixel on the epipolar line is obtained from them,
where x_l and x_r are the left and right end points and p is the current coordinate; the pixels of the image are then assigned values according to this segmentation coefficient;
finally, stereo matching of the binocular images is performed with the sum-of-squared-differences algorithm:

C(p_L(u_l, v_l), d) = ( I_L(u_l, v_l) - I_R(u_l - d, v_l) )²

where p_L(u_l, v_l) is the pixel of the image to be matched; d is the disparity value; and C(p_L(u_l, v_l), d) is the cost function between the pixel to be matched and the corresponding matching point;
after the cost function is obtained, a window of size k × j is constructed centered on I_L(u_l, v_l) and I_R(u_l - d, v_l), and cost aggregation is performed by summing the per-pixel costs over the window:

C_A(p_L(u_l, v_l), d) = Σ_{(u, v) ∈ W_{k×j}} C(p_L(u, v), d)

the disparity with the minimum aggregated cost is taken to complete the stereo matching:

d(p_L) = argmin_{d ∈ [d_min, d_max]} C_A(p_L, d)

where d_min and d_max are the minimum and maximum disparities.
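The SSD cost, window aggregation and winner-take-all selection can be sketched as follows. This is a sketch under assumptions (square k × k window, edge padding at borders, a large constant cost for columns with no valid correspondence); the epipolar distance transformation preprocessing is omitted.

```python
import numpy as np

def ssd_disparity(left, right, d_min, d_max, k=3):
    """Winner-take-all disparity from SSD costs aggregated over k x k."""
    h, w = left.shape
    r = k // 2
    costs = np.empty((d_max - d_min + 1, h, w))
    for idx, d in enumerate(range(d_min, d_max + 1)):
        diff = np.full((h, w), 1e12)          # invalid columns get a huge cost
        if d < w:
            diff[:, d:] = (left[:, d:].astype(np.float64)
                           - right[:, : w - d].astype(np.float64)) ** 2
        pad = np.pad(diff, r, mode="edge")
        agg = np.empty((h, w))
        for i in range(h):                    # brute-force cost aggregation
            for j in range(w):
                agg[i, j] = pad[i : i + k, j : j + k].sum()
        costs[idx] = agg
    return costs.argmin(axis=0) + d_min       # winner-take-all
```

A usage sketch: for a pair where the left image is the right image shifted by 2 pixels, interior pixels recover disparity 2.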
As a preferred scheme of the invention, the noise in the point cloud generated by three-dimensional reconstruction is removed from two angles: disparity optimization and point cloud optimization;
the parallax optimization method comprises the following steps:
detecting the left-right consistency of the disparity maps of the left and right images: for any point I_L(u_l, v_l) in the left map with disparity d_l, its corresponding point in the right map is I_R(u_l - d_l, v_l), whose disparity value is d_r; if a mismatch occurs, d_l and d_r differ, so points violating the following consistency condition are defined as mismatching points:

|d_l - d_r| ≤ t

where t is the threshold for judging whether left-right consistency holds;
obtaining error matching points by carrying out consistency detection on the parallax map, searching pixel points with non-zero parallax values leftwards and rightwards along polar lines for each error matching point, and marking the left parallax value and the right parallax value which are searched as d respectively 1 And d 2
And then, assigning the mismatching point by using the average value of the two parallaxes:
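A sketch of the left-right consistency check and mean-fill of mismatching pixels; the scan-line search for the nearest non-zero disparities follows the description above, and a pixel whose corresponding column falls outside the right map is also treated as mismatched (an assumption).

```python
import numpy as np

def lr_consistency_fill(disp_l, disp_r, t=1):
    """Mark pixels where |d_l - d_r| > t, then fill each marked pixel
    with the mean of the nearest non-zero disparities found to its
    left and right along the scan line."""
    h, w = disp_l.shape
    out = disp_l.astype(np.float64).copy()
    for y in range(h):
        for x in range(w):
            d = int(round(disp_l[y, x]))
            xr = x - d                          # corresponding column in the right map
            ok = 0 <= xr < w and abs(int(round(disp_r[y, xr])) - d) <= t
            if not ok:
                d1 = next((out[y, i] for i in range(x - 1, -1, -1) if out[y, i] > 0), 0.0)
                d2 = next((out[y, i] for i in range(x + 1, w) if out[y, i] > 0), 0.0)
                out[y, x] = (d1 + d2) / 2.0     # d = (d_1 + d_2) / 2
    return out
```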
the point cloud optimization removes mismatching points by using a median filtering method, and smoothes the parallax map to obtain smooth three-dimensional point cloud;
the point cloud downsampling vomits the space point cloud, finds out the maximum and minimum values of X, Y and Z coordinates corresponding to the point cloud, sets the side length of the voxels, obtains the number of voxels corresponding to the X, Y and Z coordinates, and records the point cloud corresponding to each voxel; and finally, centering the point cloud in each voxel, and reserving the point closest to the gravity center of the voxel to realize the down sampling of the point cloud to obtain an output point cloud.
As a preferred solution of the present invention, the depth corresponding to each pixel of the disparity map is calculated as:

Z = f · b / d

where f is the focal length; b is the baseline distance of the binocular camera; d is the calculated disparity; and Z is the depth of the target pixel;
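Back-projection with Z = f·b/d can be sketched as follows; the principal-point coordinates cx, cy and the pinhole relations for X and Y are standard assumptions beyond the quoted depth formula.

```python
import numpy as np

def disparity_to_points(disp, f, b, cx, cy):
    """Back-project a disparity map to an (N, 3) point cloud using
    Z = f * b / d and the pinhole relations X = (u - cx) * Z / f,
    Y = (v - cy) * Z / f; zero-disparity pixels are skipped."""
    v, u = np.nonzero(disp > 0)          # rows are v, columns are u
    z = f * b / disp[v, u]
    x = (u - cx) * z / f
    y = (v - cy) * z / f
    return np.stack([x, y, z], axis=1)
```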
when the volume of the whole point cloud is solved with the slicing method, a circumscribing polygon is found for the points of each slice and its area is calculated; the whole point cloud volume is then obtained by integration:

V = Σ ( S_1 + S_2 ) / 2 · Δh

where S_1 and S_2 are the areas of adjacent upper and lower slices, and Δh is the distance between adjacent slices.
As a preferred scheme of the invention, the point cloud volume is calculated in real time by building a heterogeneous CPU-GPU computing platform as the computing resource and using the CUDA parallel programming model: the CPU acts as the manager for data input/output and memory allocation, while the GPU acts as a coprocessor that performs the computation on the data transferred by the CPU; the GPU kernel functions and the CPU-side control logic are written in Python, the CPU-side code being executed by the Python interpreter and the kernel functions being compiled by the nvcc compiler and executed on the GPU.
Compared with the prior art, the invention has at least the following beneficial effects:
the binocular image is segmented by adopting a binocular differential image segmentation algorithm based on mean shift, and three-dimensional reconstruction of a target area is realized by adopting a three-dimensional matching algorithm based on self-adaptive threshold epipolar distance transformation, so that good performance results are respectively obtained on the target segmentation in the binocular image field and the mismatching phenomenon caused by the existence of weak textures and repeated texture features on the target surface. Meanwhile, in order to improve the calculation speed, on one hand, depth information corresponding to each pixel of the parallax image is obtained, the prior information can reduce the operation amount in the subsequent algorithm processing process, and on the other hand, the point cloud volume of the whole point cloud is solved by using a slicing method.
Furthermore, the point cloud volume solution builds a heterogeneous CPU-GPU computing platform as the computing resource and computes in real time using the CUDA parallel programming model; performing the computation in parallel on GPU hardware greatly improves computational efficiency, so intelligent non-contact measurement of the envelope volume is achieved rapidly.
Drawings
FIG. 1 is a hardware connection diagram of a binocular vision-based large aerostat volume real-time monitoring system of the present invention;
FIG. 2 is a flow chart of data processing in the volumetric calculation and visualization subsystem of the present invention;
FIG. 3 is a layout of a 5K image real-time acquisition and transmission subsystem of the present invention;
FIG. 4 is a flow chart of a binocular differential image segmentation algorithm based on mean shift of the present invention;
FIG. 5 is a flow chart of a stereo matching algorithm based on adaptive threshold epipolar distance transformation according to the present invention;
FIG. 6 is a flow chart of the CUDA-based point cloud volume acceleration calculation of the present invention;
In the accompanying drawings: 1, 5K image real-time acquisition and transmission subsystem; 2, volume calculation and visualization subsystem; 3, optical fiber; 4, coaxial cable; 5, twisted pair; 6, 5K image real-time acquisition device; 7, test site; 8, target aerostat.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings.
Referring to fig. 1, the embodiment of the invention first builds a binocular-vision-based large aerostat volume real-time monitoring system consisting of a 5K image real-time acquisition and transmission subsystem 1, a volume calculation and visualization subsystem 2, and the connecting lines between them: an optical fiber line 3, a coaxial cable 4 and a twisted pair 5. The 5K image real-time acquisition and transmission subsystem 1 comprises eight identical sets of 5K image real-time acquisition devices 6, which are arranged at specified positions and acquire binocular picture data, monitoring video stream data and laser ranging data of the target aerostat; these data are transmitted in real time to the high-performance volume calculation and visualization subsystem 2 through the optical fiber line 3, the coaxial cable 4 and the twisted pair 5.
Each set of 5K image real-time acquisition equipment 6 includes a binocular vision module, a monitoring module, a laser ranging module and a power supply module. The binocular vision, monitoring and laser ranging modules acquire real-time data and transmit it over the corresponding data buses to the high-performance volume calculation and visualization subsystem; within the volume calculation and visualization subsystem 2 the signals pass through the data exchange module and the digital image processing module, and the resulting three-dimensional point cloud and volume calculation results are shown on the display module. The digital image processing module performs image correction, image segmentation, stereo matching, three-dimensional reconstruction, point cloud stitching, volume calculation and related processing of the image data.
Referring to fig. 2, the binocular picture data, monitoring video stream data and laser ranging data are transmitted to the high-performance volume calculation and visualization subsystem: the monitoring video stream is displayed directly through a high-definition video recorder, while the laser ranging data are stored in memory and later used in image segmentation and in the disparity search of stereo matching, improving computational efficiency. The binocular image data pass through image correction, image segmentation, overlapping-region segmentation, stereo matching, three-dimensional reconstruction, point cloud stitching and volume calculation in turn, and the resulting three-dimensional point cloud and volume calculation results are output on the display.
Referring to fig. 3, because the target aerostat is very large, the lens parameters of the 5K image real-time acquisition devices and the binocular baseline distance must be considered: arranging the devices reasonably is the key to acquiring the binocular images completely and effectively. In this example, lenses with a focal length of 12 mm are used and the binocular camera baseline of each 5K image real-time acquisition device 6 is 1.2 m; calculation shows that the arrangement in fig. 3 acquires the binocular images of the target effectively, namely three devices along each side of the target aerostat 8 and one each at the top and the tail of the target aerostat 8.
Referring to fig. 4, a binocular image can be roughly divided into three parts: noisy background, foreground target and foreground noise. Compared with a monocular image, a binocular pair yields three-dimensional information for the pixels in the image, which is used to remove most of the background region. The foreground target and the foreground noise show an obvious color difference, so the mean shift algorithm is used to remove the foreground noise.
(1) After the parallax range is obtained by laser ranging, the right image is shifted to the right over the parallax range:

R_d(x, y) = R(x - d, y)

where R(x, y) is the right image before translation and R_d(x, y) is the right image after translation by d.
(2) The translated right image for each parallax is differenced with the left image. The gray values of the target region after differencing are small, and this principle is used to threshold the difference image:

D_d(x, y) = |L(x, y) - R_d(x, y)|

M_d(x, y) = 1 if D_d(x, y) < T_d, and 0 otherwise

where D_d(x, y) is the difference image corresponding to parallax d; L(x, y) is the left image; M_d(x, y) is the binary image obtained after thresholding; and T_d is the difference threshold.
(3) The binary images corresponding to all disparities are superimposed. Because the target area has gray-level similarity, after the binocular images are differenced the gray values in the target area are smaller than those in the background, and the target area is segmented on this basis:

S(x, y) = Σ_{d=d_min}^{d_max} M_d(x, y), with a pixel assigned to the target area when S(x, y) > T

wherein: S(x, y) is the superposition result of the difference images; d_min and d_max are the minimum and maximum disparities, respectively; T is the threshold for segmenting the superimposed image.
Symmetrically, the right image is differenced with the correspondingly translated left image to segment the target area of the right image. Binocular image differencing involves two parameters: the difference threshold T_d and the superposition threshold T. The remaining background area and the foreground noise area differ clearly in color from the target area, so after the binocular difference image is obtained, the target area is segmented by mean shift. The mean shift algorithm computes the mean shift vector of each pixel and moves repeatedly along this vector until convergence, thereby smoothing the image.
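The differencing-and-superposition steps (1)-(3) above can be sketched in a few lines of NumPy. The disparity loop, per-disparity thresholding, and binary-map accumulation follow the text; the threshold values `t_d` and the default `t_sum` are illustrative assumptions, since the patent does not fix them:

```python
import numpy as np

def binocular_difference_segmentation(left, right, d_min, d_max, t_d=30, t_sum=None):
    """Shift the right image over the parallax range, difference it against
    the left image, threshold each difference image, and superimpose the
    resulting binary maps.  t_d and the default t_sum are illustrative."""
    h, w = left.shape
    votes = np.zeros((h, w), dtype=np.int32)
    for d in range(d_min, d_max + 1):
        shifted = np.zeros_like(right)
        shifted[:, d:] = right[:, :w - d]           # R_d(x, y) = R(x - d, y)
        diff = np.abs(left.astype(np.int32) - shifted.astype(np.int32))
        votes += (diff < t_d).astype(np.int32)      # binary map M_d, accumulated
    if t_sum is None:
        t_sum = (d_max - d_min + 1) // 2            # assumed superposition threshold
    return votes > t_sum                            # target mask where S(x, y) > T
```

The mean shift refinement of the foreground would then run on the masked region only.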
Referring to fig. 5, after the binocular images are segmented to their overlapping regions, stereo matching is required to achieve three-dimensional reconstruction of the target. Weak texture and repetitive texture features on the target surface both cause mismatches. To solve this problem, a stereo matching algorithm based on adaptive-threshold epipolar distance transformation is proposed to improve matching accuracy.
The distance transformation threshold for each pixel is obtained from the richness of local image information. Common statistics of local image quality include the mean square error, image entropy, gradient, and mean.
The present invention uses a linear combination of image gray gradient and image local variance to represent image local quality:
G(x,y)=M(x,y)+V(x,y)
wherein: g (x, y) is the local quality of the image at pixel points (x, y) in the image; m (x, y) is the gray gradient at the pixel point (x, y); v (x, y) is the local variance at pixel point (x, y).
After the local image quality is obtained, the distance transformation threshold for each pixel is solved. Since the local quality is higher in edge regions than in non-edge regions, the contrast of the local quality between regions needs to be adjusted.
The invention uses gamma transformation to adjust the contrast. The expression is as follows:

I'(x, y) = G_m · ( G(x, y) / G_m )^γ

wherein: I'(x, y) is the value of the local image information at coordinates (x, y) after gamma transformation; G_m is the maximum of the local image information; γ is the exponent that adjusts the contrast.
To reduce the contrast between different areas, areas with high information values must be compressed and areas with low values stretched. To achieve this, γ should be smaller than 1; by comparing the results of repeated experiments, γ is set to 0.3.
After the contrast is adjusted by gamma transformation, the result is normalized to obtain the richness of each area relative to the whole image.
After the richness of the different areas is obtained, the epipolar distance transformation threshold for each pixel in the image, denoted a(x, y), is obtained with the maximum disparity value as reference.
After the corresponding threshold is obtained, the distance transformation is performed.
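A minimal sketch of the threshold computation described above: normalize the local quality map, apply gamma pre-compensation, and scale by the maximum disparity. The final scaling by d_max is an assumption, since the text does not reproduce the exact formula for a(x, y):

```python
import numpy as np

def adaptive_distance_threshold(quality, d_max, gamma=0.3):
    """Sketch of the adaptive epipolar-distance-transform threshold a(x, y):
    normalize the local quality, apply gamma pre-compensation (gamma = 0.3
    as chosen in the text), and scale the resulting richness by the maximum
    disparity.  The last scaling step is an assumption."""
    g_max = quality.max()
    if g_max == 0:
        return np.zeros_like(quality, dtype=np.float64)
    norm = quality / g_max            # normalization (minimum taken as 0)
    comp = norm ** gamma              # pre-compensation with gamma < 1
    richness = comp / comp.max()      # richness relative to the whole image
    return d_max * richness           # assumed scaling to obtain a(x, y)
```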
Referring to fig. 6, the point cloud volume calculation requires the calculation grid to be small enough to meet the accuracy requirement. However, the smaller the grid, the more hardware resources parallel computation consumes, and limited computing resources directly limit its speed. To accelerate the volume solution, a CPU+GPU heterogeneous computing platform is built as the computing resource, and real-time computation is achieved through the CUDA parallel programming model. In this heterogeneous mode, the CPU manages data input/output and memory resource allocation, while the GPU acts as a coprocessor that performs the computation on the data transferred by the CPU. Heterogeneous computing exploits the CPU's advantage in logic control and the GPU's advantage in highly concurrent tasks. The CUDA program is written in Python: the CPU-side control logic is executed by the Python interpreter, while the GPU-side kernel functions are compiled by the nvcc compiler and executed on the GPU.
Based on the thought of point cloud volume calculation, the invention designs a point cloud volume parallel calculation flow based on CUDA:
a) First, the point cloud data of the calculation target is initialized at the Host side (CPU);
b) The point cloud data is gridded and down-sampled;
c) The down-sampled point cloud data is copied from the Host side to the Device side (GPU);
d) Quadratic interpolation is performed on the point cloud;
e) The volume of each point cloud slice is calculated;
Steps (d)-(e) are repeated until the required number of iterations is reached, and the calculation ends.
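A single-threaded CPU sketch of the slicing computation in step (e). In the actual system each slice would be handled in parallel by CUDA threads; here a 2-D convex hull stands in for the enclosing polygon of each slice, which is an assumed realization:

```python
import numpy as np

def slice_volume(points, n_slices=50):
    """Bin the (N, 3) point cloud along z, take each slice's cross-section
    area as the area of the 2-D convex hull of its points, and integrate
    adjacent slices with the trapezoidal rule V = sum((S1 + S2) / 2 * dh)."""
    z = points[:, 2]
    edges = np.linspace(z.min(), z.max(), n_slices + 1)
    dh = edges[1] - edges[0]
    areas = []
    for i in range(n_slices):
        sel = points[(z >= edges[i]) & (z <= edges[i + 1])]
        areas.append(_hull_area(sel[:, :2]) if len(sel) >= 3 else 0.0)
    return sum((areas[i] + areas[i + 1]) / 2.0 * dh for i in range(n_slices - 1))

def _cross(o, a, b):
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def _hull_area(xy):
    """Convex hull (Andrew's monotone chain) area via the shoelace formula."""
    pts = sorted(set(map(tuple, xy)))
    if len(pts) < 3:
        return 0.0
    def build(seq):
        h = []
        for p in seq:
            while len(h) >= 2 and _cross(h[-2], h[-1], p) <= 0:
                h.pop()
            h.append(p)
        return h
    hull = build(pts)[:-1] + build(pts[::-1])[:-1]
    area = 0.0
    for i in range(len(hull)):
        x1, y1 = hull[i]
        x2, y2 = hull[(i + 1) % len(hull)]
        area += x1 * y2 - x2 * y1
    return abs(area) / 2.0
```

In the CUDA version, the per-slice area computations are independent and map naturally onto thread blocks, which is what makes the method amenable to GPU acceleration.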
The foregoing description of the preferred embodiment is not intended to limit the technical solution of the present invention in any way. It should be understood that the technical solution can be modified and substituted in several ways without departing from the spirit and principle of the present invention, and such modifications and substitutions also fall within the protection scope of the claims.

Claims (8)

1. The real-time monitoring method for the volume of the large aerostat based on binocular vision is characterized by comprising the following steps of:
acquiring binocular images, and segmenting the binocular images with a mean-shift-based binocular difference image segmentation algorithm;
the gray information of points within the parallax range in a specific window of the binocular images is used as a reference to segment the overlapping area of the binocular images, and non-corresponding pixels in the binocular images are removed to obtain the target area;
the specific steps for segmenting the overlapping area of the binocular images are as follows: for any epipolar line in the binocular images, select the left-end pixel of the right image and count the gray values of the pixels in its right neighborhood; find the corresponding pixels within the parallax range and count the gray values of the pixels in their neighborhoods; compare the gray values of the two neighborhoods, using the Euclidean distance to represent their gray similarity;
wherein: i l (x, y) is the gray value of the left image at point (x, y); i r (x, y) is the gray value of the right image at point (x, y); d is parallax; s is S l (d) Similarity of the regions corresponding to the parallaxes d; wherein M represents the length of the right neighborhood;
after the Euclidean distance for each disparity value is obtained, the minimum Euclidean distance and its corresponding disparity value are found by comparison;
the area to the left of the pixel at that disparity in the left image is removed;
the above operation is applied to all epipolar lines in the binocular images, removing the non-overlapping area of the left image;
similarly, the right-end pixel of the left image and the pixels of the right image within the corresponding parallax range are selected as corresponding points; the gray values of the left neighborhoods of the corresponding points are compared to obtain the Euclidean distance for each disparity value; the disparity corresponding to the minimum Euclidean distance is found, and the area to the right of the pixel at that disparity in the right image is removed; in removing the right-side non-overlapping area, the Euclidean distance is calculated as follows:
wherein: w is the image width;
a stereo matching algorithm based on adaptive-threshold epipolar distance transformation is adopted to achieve three-dimensional reconstruction of the target area and obtain a disparity map;
the specific steps of the three-dimensional reconstruction based on adaptive-threshold epipolar distance transformation are as follows:
when three-dimensional reconstruction is realized with the stereo matching algorithm of epipolar distance transformation, different thresholds are selected for different areas of the image for the distance transformation; the distance transformation threshold for each pixel is obtained from the richness of local image information;
the image local quality is represented using a linear combination of the image gray gradient and the image local variance:
G(x,y)=M(x,y)+V(x,y)
wherein: g (x, y) is the local quality of the image at pixel points (x, y) in the image; m (x, y) is the gray gradient at the pixel point (x, y); v (x, y) is the local variance at pixel point (x, y);
the image gray gradient measures the rate of change of the gray relationship between adjacent pixels in the image, and is solved as follows:

M(x, y) = sqrt( g_x^2 + g_y^2 )

wherein: g_x and g_y are the gradients of the image in the x and y directions, respectively;
wherein the gradients are expressed using the Sobel operator:
g_x = I(x+1, y-1) - I(x-1, y-1) + 2·I(x+1, y) - 2·I(x-1, y) + I(x+1, y+1) - I(x-1, y+1)
g_y = I(x-1, y+1) - I(x-1, y-1) + 2·I(x, y+1) - 2·I(x, y-1) + I(x+1, y+1) - I(x+1, y-1)
wherein: i (x, y) is the gray value at the pixel point (x, y) in the image;
the local image variance measures the richness of gray information in the image, and is solved as follows:

V(x, y) = (1/L) Σ_{p=1}^{L} ( I(p) − Ī )^2

wherein: L is the number of pixels in the window; I(p) is the gray value of the p-th pixel; Ī is the mean gray value of all pixels in the window;
adjusting contrast using gamma transformation;
the gamma transformation is divided into three steps of normalization, pre-compensation and inverse normalization;
the normalization formula is as follows:

I'(x, y) = ( G(x, y) − G_l ) / ( G_m − G_l )

wherein: G(x, y) is the local information at coordinates (x, y) before normalization; I'(x, y) is the local information at coordinates (x, y) after normalization; G_m and G_l are the maximum and minimum of the local image information, respectively;
since the minimum of the local information in the image is 0, the above formula reduces to:

I'(x, y) = G(x, y) / G_m
pre-compensation is performed after normalization, with the expression:

I''(x, y) = I'(x, y)^γ

wherein: I''(x, y) and I'(x, y) are the local information at pixel (x, y) after and before pre-compensation, respectively; γ is the exponent in the pre-compensation used to adjust contrast;
the pre-compensation is followed by inverse normalization:

I'''(x, y) = G_m · I''(x, y)

wherein: I'''(x, y) is the value of the local image information at coordinates (x, y) after the complete gamma transformation;
γ is set to 0.3;
after the contrast is adjusted by gamma transformation, the result is normalized to obtain the richness of each area:
after the richness of the different areas is obtained, the epipolar distance transformation threshold for each pixel in the image, denoted a(x, y), is obtained with the maximum disparity value as reference:
after the corresponding threshold value is obtained, performing distance transformation;
for the pixels on each epipolar line, let x be the order of the pixels with non-zero gray value on the line, with corresponding gray information I(x) and image abscissa c(x); it is judged whether the following relationships are satisfied:
c(x)-c(x-1)>T1
I(x)-I(x-1)>T2
wherein: t1 and T2 are respectively a coordinate threshold value and a gray threshold value;
if the relationships are satisfied, the x-th non-zero pixel on the epipolar line is defined as a zero point, and the area between adjacent zero points is called an original support domain; after the original support domains are obtained, the distance transformation is performed: for each pixel, search leftward and rightward along the epipolar line within its original support domain for a point whose gray value differs from that of the pixel by more than a(x, y):
|I(x_l, y) − I(x, y)| > a(x, y)
|I(x, y) − I(x_r, y)| > a(x, y)
wherein: i (x, y) is a gray value of a point with coordinates (x, y); a (x, y) is a distance conversion threshold value for each pixel;
if the gray difference at a pixel (x_l, y) exceeds the parameter a(x, y), that point is set as the left end point x_l; likewise on the right side, the right end point is denoted x_r; if no such point is found in the original support domain, the end points of the original support domain are defined as the left and right end points, and the segmentation coefficient of each pixel on the epipolar line is obtained:

wherein: x_l and x_r are the left and right end points, respectively; p is the current coordinate; the pixels in the image are assigned values according to the segmentation coefficient;
finally, stereo matching of the binocular images is performed with the sum of squared differences (SSD) algorithm:

C(p_L(u_l, v_l), d) = ( I_L(u_l, v_l) − I_R(u_l − d, v_l) )^2

wherein: p_L(u_l, v_l) is the pixel of the image to be matched; d is the disparity value; C(p_L(u_l, v_l), d) is the cost function of the pixel to be matched and its candidate matching point;
after the cost function is obtained, windows of size k × j centered on I_L(u_l, v_l) and I_R(u_l − d, v_l) are constructed and the costs are aggregated over the window;
the disparity with the minimum aggregated cost is taken as the disparity value to complete stereo matching:

d(p_L) = argmin_{d ∈ [d_min, d_max]} C_A(p_L, d)

wherein: C_A is the aggregated cost; d_min and d_max are the minimum and maximum disparities, respectively;
depth information corresponding to each pixel of the parallax image is obtained, and a point cloud is generated through three-dimensional reconstruction;
noise in the point cloud generated by three-dimensional reconstruction is removed, and the point cloud data volume is reduced by downsampling;
completing point cloud splicing based on physical position relation among cameras;
fitting the point cloud based on quadratic function interpolation to obtain a complete point cloud;
and solving the point cloud volume of the complete point cloud by using a slicing method.
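The SSD cost construction, window aggregation, and winner-take-all selection at the end of claim 1 can be sketched as follows; a square (2k+1) x (2k+1) window stands in for the generic k × j window, which is an assumed simplification:

```python
import numpy as np

def ssd_disparity(left, right, d_min, d_max, k=2):
    """Per-pixel squared-difference cost C(p_L, d), aggregated over a
    (2k+1) x (2k+1) window, followed by winner-take-all selection of the
    disparity with minimum aggregated cost."""
    h, w = left.shape
    left = left.astype(np.float64)
    right = right.astype(np.float64)
    n_d = d_max - d_min + 1
    cost = np.full((n_d, h, w), np.inf)
    for di, d in enumerate(range(d_min, d_max + 1)):
        c = np.full((h, w), np.inf)
        c[:, d:] = (left[:, d:] - right[:, :w - d]) ** 2   # C(p_L, d)
        agg = np.full((h, w), np.inf)
        for y in range(k, h - k):                          # box-filter aggregation
            for x in range(max(k, d + k), w - k):
                agg[y, x] = c[y - k:y + k + 1, x - k:x + k + 1].sum()
        cost[di] = agg
    return d_min + np.argmin(cost, axis=0)                 # winner-take-all
```

A production implementation would vectorize the aggregation with an integral image or box filter; the nested loops here only make the window sum explicit.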
2. The binocular vision-based real-time monitoring method for the volume of the large aerostat according to claim 1, characterized in that: the binocular image acquisition is completed by a real-time monitoring system comprising a 5K image real-time acquisition and transmission subsystem (1) and a volume calculation and visualization subsystem (2); the volume calculation and visualization subsystem (2) processes and calculates after receiving the signals of the 5K image real-time acquisition and transmission subsystem (1), and outputs the three-dimensional point cloud information and the volume calculation result of the aerostat;
the system is characterized in that a binocular camera module is arranged in the 5K image real-time acquisition and transmission subsystem (1), the focal length of the binocular camera module is firstly adjusted to enable the binocular camera module to clearly acquire image data of an aerostat, then the calibration of internal and external parameters and distortion coefficients of the binocular camera module is carried out by using a Zhang Zhengyou calibration method, and the calibration result is transmitted to the volume calculation and visualization subsystem (2).
3. The binocular vision-based real-time monitoring method for the volume of the large aerostat according to claim 2, characterized in that: the 5K image real-time acquisition and transmission subsystem (1) consists of eight sets of 5K image real-time acquisition devices (6); the eight sets of devices (6) are arranged around the target aerostat (8) according to the lens parameters of each device, the binocular baseline distance, and the size of the target aerostat (8), so that the image information of the target aerostat (8) can be completely acquired.
4. The binocular vision-based real-time monitoring method for the volume of the large aerostat according to claim 1, characterized in that: the binocular images are obtained by rectifying the acquired images of each view using the MATLAB toolbox Stereo Camera Calibrator.
5. The binocular vision-based large aerostat volume real-time monitoring method according to claim 1, wherein the specific steps of segmenting the binocular image by adopting a binocular difference image segmentation algorithm based on mean shift are as follows:
taking the segmentation of the left image in the binocular images as an example;
after the parallax range is obtained, the right image is translated to the right by every integer disparity within the parallax range;

R_d(x, y) = R(x - d, y)

wherein R(x, y) is the right image before translation; R_d(x, y) is the translated right image, with translation value d;
the translated right image corresponding to each disparity is differenced with the left image, and the difference image is thresholded:

D_d(x, y) = |L(x, y) - R(x - d, y)|

M_d(x, y) = 1 if D_d(x, y) < T_d, otherwise 0

wherein D_d(x, y) is the difference image corresponding to disparity d; L(x, y) is the left image; M_d(x, y) is the binary image obtained after threshold segmentation; T_d is the difference threshold;
the binary images corresponding to all disparities are superimposed, and the target area is segmented:

S(x, y) = Σ_{d=d_min}^{d_max} M_d(x, y), with the target area taken where S(x, y) > T

wherein S(x, y) is the superposition result of the difference images; d_min and d_max are the minimum and maximum disparities; T is the threshold for segmenting the superimposed image;
symmetrically, the right image is differenced with the correspondingly translated left image to segment the target area of the right image.
6. The binocular vision-based large aerostat volume real-time monitoring method according to claim 1, wherein the noise in the point cloud generated by three-dimensional reconstruction is removed from two angles: disparity optimization and point cloud optimization;
the parallax optimization method comprises the following steps:
left-right consistency detection is performed on the disparity maps of the left and right images; for any point I_L(u_l, v_l) in the left image with disparity d_l, its corresponding point in the right image is I_R(u_l − d_l, v_l), and the disparity value of that corresponding point is d_r; if a mismatch occurs, d_l and d_r differ; points satisfying the following condition are defined as mismatching points:
|d_l − d_r| > t
wherein: t is the threshold for judging whether left-right consistency holds;
mismatching points are obtained by consistency detection of the disparity map; for each mismatching point, pixels with non-zero disparity values are searched leftward and rightward along the epipolar line, and the disparity values found on the left and right are denoted d_1 and d_2, respectively;
the mismatching point is then assigned the average of the two disparities:

d = ( d_1 + d_2 ) / 2
for point cloud optimization, mismatching points are removed with a median filter, and the disparity map is smoothed to obtain a smooth three-dimensional point cloud;
the point cloud downsampling voxelizes the spatial point cloud: the maximum and minimum X, Y, and Z coordinates of the point cloud are found, the voxel side length is set, the number of voxels along each axis is obtained, and the points belonging to each voxel are recorded; finally, within each voxel the point closest to the voxel's center of gravity is retained, realizing the downsampling and producing the output point cloud.
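The voxel-grid downsampling described in claim 6 can be sketched as follows; keeping the single point nearest each voxel's centroid (the "gravity center" of its points) follows the text:

```python
import numpy as np

def voxel_downsample(points, voxel_size):
    """Voxelize an (N, 3) point cloud and, in each occupied voxel, keep
    only the point closest to the centroid of that voxel's points."""
    mins = points.min(axis=0)
    idx = np.floor((points - mins) / voxel_size).astype(np.int64)
    voxels = {}
    for i, key in enumerate(map(tuple, idx)):
        voxels.setdefault(key, []).append(i)       # group point indices by voxel
    kept = []
    for members in voxels.values():
        pts = points[members]
        centroid = pts.mean(axis=0)
        kept.append(members[int(np.argmin(np.linalg.norm(pts - centroid, axis=1)))])
    return points[sorted(kept)]
```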
7. The binocular vision-based large aerostat volume real-time monitoring method according to claim 1, wherein the depth information corresponding to each pixel of the disparity map is calculated as:

Z = f · b / d

wherein: f is the focal length; b is the baseline distance of the binocular camera; d is the calculated disparity; Z is the depth of the target pixel;
when the point cloud volume of the complete point cloud is solved by the slicing method, an enclosing polygon is found for the points in each slice, the polygon area is calculated, and the whole point cloud volume is finally obtained by integration:

V = Σ ( ( S_1 + S_2 ) / 2 ) · Δh

wherein: S_1 and S_2 are the areas of the upper and lower slices, respectively; Δh is the distance between adjacent slices.
8. The binocular vision-based large aerostat volume real-time monitoring method is characterized in that the point cloud volume solution is computed in real time by building a CPU+GPU heterogeneous computing platform as the computing resource and using the CUDA parallel programming model; in the platform, the CPU manages data input/output and memory resource allocation, while the GPU acts as a coprocessor that performs the computation on the data transferred by the CPU; the CUDA program is written in Python, the CPU-side control logic being executed by the Python interpreter, and the GPU-side kernel functions being compiled by the nvcc compiler and executed on the GPU.
CN202111108084.7A 2021-09-22 2021-09-22 Large aerostat volume real-time monitoring method based on binocular vision Active CN113963052B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111108084.7A CN113963052B (en) 2021-09-22 2021-09-22 Large aerostat volume real-time monitoring method based on binocular vision


Publications (2)

Publication Number Publication Date
CN113963052A CN113963052A (en) 2022-01-21
CN113963052B true CN113963052B (en) 2023-08-18

Family

ID=79461877

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111108084.7A Active CN113963052B (en) 2021-09-22 2021-09-22 Large aerostat volume real-time monitoring method based on binocular vision

Country Status (1)

Country Link
CN (1) CN113963052B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103868460A (en) * 2014-03-13 2014-06-18 桂林电子科技大学 Parallax optimization algorithm-based binocular stereo vision automatic measurement method
WO2018086348A1 (en) * 2016-11-09 2018-05-17 人加智能机器人技术(北京)有限公司 Binocular stereo vision system and depth measurement method
CN111563921A (en) * 2020-04-17 2020-08-21 西北工业大学 Underwater point cloud acquisition method based on binocular camera
CN111856448A (en) * 2020-07-02 2020-10-30 山东省科学院海洋仪器仪表研究所 Marine obstacle identification method and system based on binocular vision and radar
WO2021120407A1 (en) * 2019-12-17 2021-06-24 大连理工大学 Parallax image stitching and visualization method based on multiple pairs of binocular cameras


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Overlapping fruit image segmentation algorithm based on disparity images; Peng Hui, Wu Pengfei, Zhai Ruifang, Liu Shanmei, Wu Lanlan, Jing Xiu; Transactions of the Chinese Society for Agricultural Machinery (Issue 06); full text *


Similar Documents

Publication Publication Date Title
US11562498B2 (en) Systems and methods for hybrid depth regularization
CN107833270B (en) Real-time object three-dimensional reconstruction method based on depth camera
Du et al. Car detection for autonomous vehicle: LIDAR and vision fusion approach through deep learning framework
CN108682026B (en) Binocular vision stereo matching method based on multi-matching element fusion
US8385630B2 (en) System and method of processing stereo images
CN108596975B (en) Stereo matching algorithm for weak texture region
CN113313815B (en) Real-time three-dimensional reconstruction method for object grabbed by mechanical arm
CN106530333B (en) Interest frequency solid matching method based on binding constraint
CN114519772A (en) Three-dimensional reconstruction method and system based on sparse point cloud and cost aggregation
CN114677479A (en) Natural landscape multi-view three-dimensional reconstruction method based on deep learning
Shivakumar et al. Real time dense depth estimation by fusing stereo with sparse depth measurements
CN114549669B (en) Color three-dimensional point cloud acquisition method based on image fusion technology
CN116189140A (en) Binocular vision-based vehicle three-dimensional target detection algorithm
CN109903322B (en) Depth camera depth image restoration method
CN113570701B (en) Hair reconstruction method and device
Nouduri et al. Deep realistic novel view generation for city-scale aerial images
CN112270701A (en) Packet distance network-based parallax prediction method, system and storage medium
CN113963052B (en) Large aerostat volume real-time monitoring method based on binocular vision
CN116704307A (en) Target detection method and system based on fusion of image virtual point cloud and laser point cloud
CN115631223A (en) Multi-view stereo reconstruction method based on self-adaptive learning and aggregation
CN115239559A (en) Depth map super-resolution method and system for fusion view synthesis
Dogaru et al. Sphere-guided training of neural implicit surfaces
Dargazany et al. Stereo-based terrain traversability estimation using surface normals
Murayama et al. Depth Image Noise Reduction and Super-Resolution by Pixel-Wise Multi-Frame Fusion
Liu et al. Semi-global depth from focus

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant