CN113792593A - Underwater close-range target identification and tracking method and system based on depth fusion - Google Patents

Underwater close-range target identification and tracking method and system based on depth fusion

Info

Publication number
CN113792593A
Authority
CN
China
Prior art keywords
underwater
target
close
range target
distance
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110911151.2A
Other languages
Chinese (zh)
Inventor
舒雯雯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Kunshan Tailanhe Robot Technology Co ltd
Original Assignee
Kunshan Tailanhe Robot Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Kunshan Tailanhe Robot Technology Co ltd filed Critical Kunshan Tailanhe Robot Technology Co ltd
Priority to CN202110911151.2A
Publication of CN113792593A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/292 Multi-camera tracking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/50 Depth or shape recovery
    • G06T7/55 Depth or shape recovery from multiple images
    • G06T7/593 Depth or shape recovery from multiple images from stereo images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10004 Still image; Photographic image
    • G06T2207/10012 Stereo images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20076 Probabilistic image processing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The underwater close-range target recognition and tracking method and system based on depth fusion capture images of an underwater close-range target with a binocular camera; reconstruct three-dimensional point cloud data of the underwater close-range target from the images with a binocular disparity matching algorithm; segment the three-dimensional point cloud data with a clustering algorithm to obtain candidate regions of the underwater close-range target; and classify the targets in the candidate regions with a classification network to obtain class labels, according to which the underwater close-range targets are recognized and tracked. The system addresses the large size, high power consumption, and low target recognition accuracy and speed of existing image-based target recognition and tracking systems for underwater robots, and can be used in small spherical underwater robots in offshore, shallow, and narrow waters.

Description

Underwater close-range target identification and tracking method and system based on depth fusion
Technical Field
The invention belongs to the technical field of underwater target identification, and particularly relates to an underwater close-range target identification and tracking method and system based on depth fusion.
Background
Compared with a large AUV, the spherical underwater robot is highly flexible and adapts well to shallow, narrow underwater environments, making it an important tool for underwater operations offshore and in shallow water. However, constrained by its volume and structural design, it is weak in power budget, carrying capacity, number of sensors, and processor performance, so conventional target recognition and tracking systems cannot be applied to it. Existing target recognition and tracking systems for spherical underwater robots mostly adopt technical schemes based on sonar detection or marker detection; these technologies suffer from large volume, high power consumption, and poor environmental adaptability, and can hardly meet the spherical underwater robot's need for close-range underwater target recognition. The invention aims to solve the problems of large volume, high power consumption, and poor environmental adaptability of existing spherical underwater robot target recognition systems, reducing system volume and power consumption and improving the robot's adaptability to the underwater environment.
From the document "Schechner Y, Karpel N. Clear Underwater Vision [C]. Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. IEEE, 2004", it is known that when a robot performs an underwater work task it often needs to acquire the coordinate position of a target object in real time, a capability that depends on an underwater target tracking system. Such a system is therefore important for applications such as robot navigation and positioning, formation cooperation, and visual servoing.
The document "Isbitiren G, Akan O B. Three-Dimensional Underwater Target Tracking With Acoustic Sensor Networks [J]. IEEE Transactions on Vehicular Technology, 2011, 60(8): 3897-" describes sonar-based target tracking, which locates a target from the echoes of emitted acoustic pulses. Such systems are mainly applied to target detection and tracking for large underwater vehicles and cannot meet the spherical underwater robot's requirements of small volume, low mass, and low power consumption; see also "Paull L, Saeedi S, Seto M, et al. AUV Navigation and Localization: A Review [J]. IEEE Journal of Oceanic Engineering, 2014, 39(1): 131-".
A target tracking system based on wireless markers must implant markers, such as fluorescent or acoustic tags, into the target object; its application scenarios are limited, and it cannot meet the spherical underwater robot's need to operate in unknown waters. See "Delcourt J, Ylieff M, Bolliet V, et al. Video tracking in the extreme: A new possibility for tracking nocturnal underwater fish with fluorescent elastomer tags [J]. Behavior Research Methods, 2011, 43(2): 590-600" and "Grothues T M, Dobarro J, Eiler J. Collecting, interpreting, and merging fish telemetry data from an AUV [C]. 2010".
A target tracking system based on lidar scans the spatial environment with laser beams and constructs the surface profile of spatial objects by laser ranging in order to identify and finally locate the target object. Lidar is currently in wide use on ground robot platforms; see the Chinese-language study of multilayer-lidar environment perception for unmanned vehicles in the Journal of Beijing University of Technology, 2014, 40(12): 1891-.
As described in "Schechner Y, Karpel N. Clear Underwater Vision [C]. Computer Vision and Pattern Recognition. IEEE, 2004", an underwater target tracking system based on visible-light images captures underwater images with a visible-light camera and then identifies and tracks the target object in the images with computer vision algorithms.
However, owing to the underwater environment, long-distance, wide-range underwater images generally suffer from color distortion, underexposure, and feature blurring, so visible-light underwater target tracking systems have seen little use on large underwater robot platforms.
Unlike large underwater robots whose main application scenarios are the open ocean and deep sea, the spherical underwater robot mainly operates in offshore shallow-water environments and narrow spaces. In these environments the water quality is relatively good, the required detection distance of the underwater target tracking system is modest, and close-range targets dominate. Combined with the spherical underwater robot's requirements for a miniaturized, low-power underwater target tracking system, a visible-light-based underwater target tracking system becomes the preferred scheme for the spherical underwater robot to perceive the underwater environment; see "Soni O K, Kumare J S. A Survey on Underwater Images Enhancement Technologies [C]. 2020 IEEE 9th International Conference on Communication Systems and Network Technologies (CSNT)". The Group Bot spherical underwater robot and the Salamander miniature spherical underwater robot carry monocular cameras for underwater image acquisition and can transmit image data back in real time, but they do not further identify or track target objects in the images with computer vision algorithms. The document "Guo S, Pan S, Shi L, et al. Visual Detection and Tracking System for a Spherical Amphibious Robot [J]. Sensors, 2017, 17(4): 870-890" describes the ASR-III spherical underwater robot, which carries a ToF (time of flight) camera to acquire RGB-D images of the underwater environment and uses a Xilinx Zynq-7000 SoC processor to build an embedded underwater target tracking system capable of real-time tracking at 20 fps. However, that system requires the initial target position to be selected manually and cannot recognize deeper target information such as object class, so it faces various limitations in actual use. Because the ToF camera is susceptible to impurities and water-body scattering, the ASR-IV spherical underwater robot upgraded the ToF camera to a binocular camera; it obtains a disparity map with a stereo matching algorithm, measures the distance of each pixel in the image, and completes underwater three-dimensional positioning and obstacle avoidance using HOG features and an SVM classifier.
According to the document "Pan Shawu. Bionic amphibious spherical robot embedded real-time image target tracking system [D]. Beijing Institute of Technology, 2018", to meet the target tracking requirement of a spherical underwater robot in an amphibious environment, an RGB-D image acquisition system using a ToF camera was built and an RGB-D target tracking algorithm based on tracking-result fusion was proposed. The algorithm collects visible-light and depth images simultaneously, tracks the target in both, and fuses the tracking results with an extended Kalman filter. A CT algorithm serves as the main tracker on the visible-light image, while on the depth image a clustering algorithm selects the largest class of the depth histogram as the target region and a VR-V (Variance Ratio Features Shift) tracker is built as the auxiliary tracker. Tests on a depth-image data set and in an actual underwater environment show that the algorithm can successfully track targets in water and has certain advantages in occlusion resistance and real-time computation. However, the algorithm clusters the depth histogram and takes only the largest class as the target region, so it cannot separate multiple targets from the background and cannot perform multi-target tracking. In addition, the tracking process only judges the similarity of candidate regions in adjacent frames and cannot obtain information such as target class. The algorithm is therefore difficult to apply to the spherical underwater robot's multi-target recognition and tracking requirements in water.
The document "Liu Yu. Research on path planning and obstacle avoidance for a bionic spherical underwater robot [D]. Beijing: Beijing Institute of Technology, 2020" uses a binocular camera to acquire stereoscopic information about the robot's surroundings and, combined with disparity-image processing, proposes obstacle detection based on disparity-map optimization and three-dimensional information fusion. Obstacle detection can be treated like target detection: the algorithm obtains a disparity map by binocular stereo matching, preprocesses it with morphological closing, opening, and multi-level filtering to obtain a relatively complete obstacle disparity image, and finally performs contour extraction and binarization on the disparity image and screens out the target obstacle. Experiments on land and underwater show that the algorithm can recognize target obstacles within 0.2 m to 1.5 m in an embedded environment. However, the algorithm still cannot meet the target recognition and tracking requirements of the spherical underwater robot in complex environments. First, it relies only on the binocular depth image for obstacle recognition and does not use the RGB image, losing a large amount of detail. Second, it segments obstacles by extracting their contours, but contour extraction depends heavily on image quality and is strongly affected by interference factors such as illumination, imaging sharpness, and background. Third, after contour extraction on the depth image, contours produced by interference signals are removed by sorting and screening according to contour size, but this method is easily disturbed and unreliable when recognizing multiple obstacles.
Therefore, the current target recognition and tracking system of the spherical underwater robot has the following problems:
(1) The systems are large, power-hungry, and poorly adapted to the environment; they perform poorly on small robot platforms such as spherical underwater robots and cannot meet the requirements of offshore, shallow, and narrow underwater environments.
(2) Target recognition accuracy and speed are low. Underwater target recognition and tracking systems based on visible-light images generally adopt a monocular-camera imaging scheme; they cannot obtain the spatial three-dimensional coordinates of a target and must recognize it from image features alone. This scheme is easily disturbed by impurities in the water, light scattering, and other factors, so misrecognition occurs easily. Moreover, target retrieval and image-feature recognition require a large amount of computation, so recognition is slow.
To address these problems, an underwater close-range target recognition and tracking system based on depth fusion is built. First, the target is photographed with a binocular camera, and the spatial three-dimensional coordinate point cloud of the target object is reconstructed with a binocular disparity matching algorithm on the basis of calibrated images. Then the coordinate point cloud is cluster-segmented with a mean shift clustering algorithm, separating the target object from the background and yielding target-region candidate boxes. Finally, a classification network classifies the objects in the candidate boxes to obtain their class labels, realizing recognition and tracking of the close-range target.
Disclosure of Invention
The invention overcomes one of the defects of the prior art by providing an underwater close-range target recognition and tracking method based on depth fusion, which solves the problems of large system size, high power consumption, and low target recognition accuracy and speed in existing image-based target recognition and tracking systems for underwater robots, and can be used in small spherical underwater robots in offshore, shallow, and narrow waters.
According to one aspect of the disclosure, the invention provides an underwater close-range target recognition and tracking method based on depth fusion, which includes:
shooting an image of an underwater close-range target by using a binocular camera;
reconstructing three-dimensional point cloud data of the underwater close-range target based on the image and a binocular disparity matching algorithm;
utilizing a clustering algorithm to segment the three-dimensional point cloud data of the underwater close-range target to obtain a candidate region of the underwater close-range target;
and classifying the targets in the underwater close-range target candidate region by using a classification network to obtain the class labels of the underwater close-range targets, and recognizing and tracking the underwater close-range targets according to their class labels.
In one possible implementation, the shooting of an image of an underwater close-range target by using a binocular camera includes:
shooting the underwater close-range target by using two relatively fixed cameras located at the same horizontal position to obtain two images of the underwater close-range target.
In one possible implementation, the reconstructing three-dimensional point cloud data of the underwater close-range target based on the image and the binocular disparity matching algorithm includes:
and constructing a disparity map of the underwater close-range target according to the position difference of the underwater close-range target in the two images, and performing three-dimensional reconstruction on pixel points of the underwater close-range target in the two images by using a binocular disparity matching algorithm based on the disparity map of the underwater close-range target to obtain three-dimensional point cloud data of the underwater close-range target.
In one possible implementation, the clustering algorithm is a density-based Mean shift clustering algorithm.
In one possible implementation, the segmenting of the three-dimensional point cloud data of the underwater close-range target with a clustering algorithm to obtain the candidate region of the underwater close-range target includes:
introducing a kernel function $G_H(x)$ to compute the probability density gradient function $M_H(x)$ of the Mean shift clustering algorithm, and cluster-segmenting the three-dimensional point cloud data of each underwater close-range target along its probability density gradient direction to obtain the candidate region of the underwater close-range target.
According to another aspect of the present disclosure, an underwater close-range target recognition and tracking system based on depth fusion is provided, the system comprising:
the acquisition module is used for shooting an image of an underwater close-range target by using a binocular camera;
the reconstruction module is used for reconstructing three-dimensional point cloud data of the underwater close-range target based on the image and a binocular disparity matching algorithm;
the segmentation module is used for segmenting the three-dimensional point cloud data of the underwater close-range target by utilizing a clustering algorithm to obtain a candidate region of the underwater close-range target;
and the tracking module is used for classifying the targets in the underwater close-range target candidate region by using a classification network to obtain the class labels of the underwater close-range targets and realizing recognition and tracking of the underwater close-range targets according to their class labels.
The underwater close-range target recognition and tracking method based on depth fusion captures images of an underwater close-range target with a binocular camera; reconstructs three-dimensional point cloud data of the underwater close-range target from the images with a binocular disparity matching algorithm; segments the three-dimensional point cloud data with a clustering algorithm to obtain candidate regions of the underwater close-range target; and classifies the targets in the candidate regions with a classification network to obtain class labels, according to which the underwater close-range targets are recognized and tracked. The system addresses the large size, high power consumption, and low target recognition accuracy and speed of existing image-based target recognition and tracking systems for underwater robots, and can be used in small spherical underwater robots in offshore, shallow, and narrow waters.
Drawings
The accompanying drawings are included to provide a further understanding of the technology or prior art of the present application and are incorporated in and constitute a part of this specification. The drawings expressing the embodiments of the present application are used for explaining the technical solutions of the present application, and should not be construed as limiting the technical solutions of the present application.
FIG. 1 shows a schematic view of a spherical underwater robotic platform according to an embodiment of the present disclosure;
FIG. 2 shows a schematic diagram of a depth fusion based underwater close-range target recognition and tracking system framework according to an embodiment of the present disclosure;
FIG. 3 shows a flowchart of a depth fusion based underwater close-range target recognition and tracking method according to an embodiment of the present disclosure;
FIG. 4 shows a pinhole camera imaging model schematic diagram in accordance with an embodiment of the present disclosure;
fig. 5 shows a schematic view of a binocular camera spatial reconstruction model according to an embodiment of the present disclosure.
Detailed Description
The following detailed description of the embodiments of the present invention will be provided with reference to the accompanying drawings and examples, so that how to apply the technical means to solve the technical problems and achieve the corresponding technical effects can be fully understood and implemented. The embodiments and the features of the embodiments can be combined without conflict, and the technical solutions formed are all within the scope of the present invention.
Additionally, the steps illustrated in the flowcharts of the figures may be performed in a computer system, such as by executing a set of computer-executable instructions. Also, although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in an order different from the one here.
FIG. 1 shows a schematic view of a spherical underwater robotic platform according to an embodiment of the present disclosure; FIG. 2 shows a schematic diagram of a depth fusion-based underwater close-range target recognition and tracking system framework according to an embodiment of the disclosure.
As shown in fig. 1, the spherical underwater robot platform comprises a water inlet cabin, a water inlet, an underwater acoustic communication module, a binocular camera, a waterproof cabin, a driving steering engine, a battery cabin, a composite driver, and leg-type water-jet propellers.
As shown in fig. 2, in the depth-fusion target recognition and tracking system of the spherical underwater robot, a pair of target images (the left and right images in fig. 2) is first acquired with the binocular camera of the spherical underwater robot platform of fig. 1 and calibrated; three-dimensional point cloud data of the target object in space are then reconstructed with a binocular stereo matching algorithm and a disparity-map method. A mean shift clustering algorithm cluster-segments all the three-dimensional points; because different target objects occupy relatively independent spatial positions, each target object is separated from the background into its own target candidate region. Finally, a classification network classifies each target candidate region to obtain the object's class label. This constitutes a complete target recognition and tracking system suitable for an embedded environment, together with the underwater close-range target recognition and tracking method based on it.
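As an orientation aid, the following Python sketch shows how the four stages of fig. 2 might be wired together; the helper names (reconstruct_point_cloud, mean_shift, regions_from_clusters, classify_region) are hypothetical stand-ins for the routines detailed in steps S1 to S4 below, not functions defined by the patent.

```python
# Hypothetical wiring of the pipeline in fig. 2; each helper is sketched
# in the corresponding step below (S2-S4).
def recognize_and_track(left_img, right_img):
    # S2: binocular disparity matching -> 3D point cloud plus pixel indices
    points, pixels = reconstruct_point_cloud(left_img, right_img)
    # S3: mean shift clustering -> one cluster label per 3D point
    labels, centers = mean_shift(points)
    results = []
    # gather the pixels of each cluster into a candidate bounding box
    for box in regions_from_clusters(pixels, labels):
        # S4: classify the candidate region to obtain its class label
        results.append((classify_region(left_img, box), box))
    return results  # class labels and boxes drive recognition and tracking
```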
Fig. 3 shows a flowchart of an underwater close-range target recognition and tracking method based on depth fusion according to an embodiment of the present disclosure.
As shown in fig. 3, the method may include:
step S1: and shooting the image of the underwater short-distance target by using a binocular camera.
The binocular camera may be arranged as two relatively fixed cameras located at the same horizontal position. The two cameras shoot the underwater close-range target, and two images of the target are obtained according to the binocular stereoscopic vision imaging principle.
The binocular stereoscopic vision imaging principle is that two cameras with relatively fixed positions shoot the same scene; the position difference of the same scene point in the two images is computed to construct a disparity map, and the disparity map is then used to reconstruct the pixels of the picture in three dimensions, yielding three-dimensional position information of the scene. Understanding binocular stereo imaging starts with the ordinary monocular camera. The monocular camera imaging model involves a world coordinate system, a camera coordinate system, an image coordinate system, and a pixel coordinate system. The world coordinate system represents the coordinate position of an object in the real world. The camera coordinate system represents the coordinate position of an object in the real world with the camera as origin; its Z axis ($Z_c$) is parallel to the optical axis of the monocular camera and passes through the optical center. The image coordinate system $(x, y)$ represents object coordinates in real-world distance units, with the center of the monocular camera's imaging sensor array as origin. The pixel coordinate system $(u, v)$ takes the upper-left corner of the imaging sensor array as origin and represents the coordinates of the object in the image in pixel units. The imaging process of the monocular camera can be viewed as the object being transformed, in sequence, from the world coordinate system to the pixel coordinate system.
First, the image coordinate system differs from the pixel coordinate system in that the image coordinate system uses real-world distance units (such as mm) with its origin at the center of the imaging sensor array, whereas the pixel coordinate system uses the rows and columns of the pixels with its origin at the upper-left corner of the imaging sensor array. The conversion between the image coordinates and the pixel coordinates of the target object is:

$$u = \frac{x}{dx} + u_0, \qquad v = \frac{y}{dy} + v_0 \tag{1}$$

where $(u_0, v_0)$ is the position of the center point $O$ of the imaging sensor array in the pixel coordinate system. If the unit of $x$ is mm, then the unit of $dx$ is mm/pixel, i.e., the physical size of the target object's image that each pixel represents through the lens at the imaging sensor array plane. Equation (1) is expressed in homogeneous matrix form as:

$$\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} \dfrac{1}{dx} & 0 & u_0 \\ 0 & \dfrac{1}{dy} & v_0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \tag{2}$$
the target object is transformed from the world coordinate system to the camera coordinate system only by performing translation and rotation transformation on the object without changing the size and the shape of the object, so that the transformation from the world coordinate system to the camera coordinate system belongs to rigid body transformation. The rigid body transformation can be represented by the formula:
Figure BDA0003200441750000093
for convenience of calculation, the above formula is converted into a homogeneous expression:
Figure BDA0003200441750000094
in equation (4), R is a 3 × 3 orthogonal unit matrix and represents a rotation change of the target object, t represents a translation amount of the coordinate origin in three dimensions, and R, t is also referred to as an external parameter of the monocular camera since R, t is related to only the placement position of the monocular camera and is not related to the configuration of the monocular camera imaging system.
FIG. 4 shows a pinhole camera imaging model schematic according to an embodiment of the present disclosure.
When an object is transformed from the camera coordinate system to the image coordinate system, the lens of the monocular camera is generally very small relative to objects in the real world, so monocular imaging can be regarded as pinhole imaging. As shown in FIG. 4, $\pi$ is the image plane of the monocular camera and $O_c$ is its optical center; for convenience of analysis, the model mirrors the image plane $\pi$ from behind the optical center to in front of it. $f$ is the focal length of the camera. The $z_c$ axis is the principal axis of the camera, perpendicular to the image plane $\pi$, and its intersection point $p$ with the image plane is called the principal point of the camera.
Suppose an object in space is located at point $X_c$, whose homogeneous coordinates in the monocular camera coordinate system are

$$X_c = \begin{bmatrix} x_c & y_c & z_c & 1 \end{bmatrix}^T \tag{5}$$

and whose image point has homogeneous coordinates in the image coordinate system

$$m = \begin{bmatrix} x & y & 1 \end{bmatrix}^T \tag{6}$$
From the triangle similarity relationship, it can be derived that:

$$x = \frac{f\,x_c}{z_c}, \qquad y = \frac{f\,y_c}{z_c} \tag{7}$$

In matrix form:

$$z_c \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} x_c \\ y_c \\ z_c \\ 1 \end{bmatrix} \tag{8}$$
The above analysis gives the coordinate transformation of each link in the imaging process of an ordinary monocular camera. Composing them yields the transformation from the position of an object in the world coordinate system to its position in the pixel coordinate system of the final image, equation (9):

$$z_c \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} \dfrac{1}{dx} & 0 & u_0 \\ 0 & \dfrac{1}{dy} & v_0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} R & t \\ 0^T & 1 \end{bmatrix} \begin{bmatrix} X_w \\ 1 \end{bmatrix} \tag{9}$$

In equation (9), the transformation matrix $P$ from world coordinates to pixel coordinates is defined as:

$$P = \begin{bmatrix} \dfrac{f}{dx} & 0 & u_0 & 0 \\ 0 & \dfrac{f}{dy} & v_0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} R & t \\ 0^T & 1 \end{bmatrix} \tag{10}$$
Through these transformations, the imaging of the underwater close-range target under the binocular stereoscopic vision imaging principle is obtained.
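As a minimal numeric sketch of the projection chain in equations (1) to (10), the following Python fragment maps a world point to pixel coordinates with NumPy; the intrinsic values (f, dx, dy, u0, v0) and the extrinsics (R, t) are illustrative placeholders, not calibration data from the patent.

```python
import numpy as np

f, dx, dy = 4.0, 0.002, 0.002        # focal length (mm), pixel pitch (mm/pixel)
u0, v0 = 320.0, 240.0                # principal point (pixels)

# intrinsic matrix combining equations (2) and (8)
K = np.array([[f / dx, 0.0,    u0],
              [0.0,    f / dy, v0],
              [0.0,    0.0,    1.0]])

R = np.eye(3)                        # extrinsic rotation (identity for the sketch)
t = np.array([[0.0], [0.0], [0.5]])  # extrinsic translation (metres)

P = K @ np.hstack([R, t])            # transformation matrix P of equation (10)

Xw = np.array([0.1, -0.05, 2.0, 1.0])      # homogeneous world point
uvw = P @ Xw                               # equation (9)
u, v = uvw[0] / uvw[2], uvw[1] / uvw[2]    # divide by z_c to get pixel coordinates
print(f"pixel coordinates: ({u:.1f}, {v:.1f})")
```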
Step S2: reconstructing the three-dimensional point cloud data of the underwater close-range target based on the images and the binocular disparity matching algorithm.
Through the analysis of the imaging principle of the ordinary monocular camera in step S1, it can be found that the imaging process of the monocular camera is a process of transforming the object from the three-dimensional world coordinate system to the two-dimensional pixel coordinate system through projection. Since the three-dimensional coordinates are compressed to two-dimensional coordinates in the projection process, the spatial three-dimensional coordinates of the object cannot be measured by the monocular camera. In order to solve the problem, the binocular stereo vision imaging shoots a target object through two cameras at the same time, and three-dimensional space coordinates of the target are restored through a disparity map.
In an example, a disparity map of the underwater close-range target can be constructed according to the position difference of the underwater close-range target in two images, and based on the disparity map of the underwater close-range target, a binocular disparity matching algorithm is utilized to perform three-dimensional reconstruction on pixel points of the underwater close-range target in the two images so as to obtain three-dimensional point cloud data of the underwater close-range target.
Fig. 5 shows a schematic view of a binocular camera spatial reconstruction model according to an embodiment of the present disclosure.
As shown in FIG. 5, two cameras $c_1$ and $c_2$ are located at the same horizontal position, with optical centers $O_{c1}$ and $O_{c2}$ separated by a baseline $b$, and with main optical axes $Z_{c1}$ and $Z_{c2}$ parallel to each other. Suppose a point $X_c$ in space has coordinates $(x_c, y_c, z_c)$ in the world coordinate system and that its image points $m_1$ and $m_2$ have image coordinates $(x_{left}, y_{left})$ and $(x_{right}, y_{right})$, respectively. Since the two cameras are mounted at the same level, $y_{left} = y_{right} = y$, and we can derive:

$$x_{left} = \frac{f\,x_c}{z_c}, \qquad x_{right} = \frac{f\,(x_c - b)}{z_c}, \qquad y = \frac{f\,y_c}{z_c} \tag{11}$$

Defining the disparity as $disparity = x_{left} - x_{right}$ and solving the above equations gives:

$$z_c = \frac{f\,b}{disparity}, \qquad x_c = \frac{b\,x_{left}}{disparity}, \qquad y_c = \frac{b\,y}{disparity} \tag{12}$$
According to the above formula, the three-dimensional space coordinates of the object point $X_c$ can be reconstructed from the coordinates of the image points $m_1$ and $m_2$.
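A minimal sketch of this reconstruction, assuming a rectified stereo pair: OpenCV's StereoSGBM stands in for the binocular disparity matching algorithm (the patent does not name a specific matcher), and the focal length fx (pixels) and baseline b (metres) are illustrative values rather than calibration data. The back-projection follows equation (12), expressed in pixel units.

```python
import cv2
import numpy as np

left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)    # rectified image pair
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

# semi-global block matching as a stand-in disparity matcher
matcher = cv2.StereoSGBM_create(minDisparity=0, numDisparities=64, blockSize=7)
disparity = matcher.compute(left, right).astype(np.float32) / 16.0  # SGBM output is fixed-point

fx, b = 700.0, 0.06                       # assumed focal length (px) and baseline (m)
u0, v0 = left.shape[1] / 2.0, left.shape[0] / 2.0

valid = disparity > 0                     # keep pixels with a valid match
v, u = np.nonzero(valid)
d = disparity[valid]

z = fx * b / d                            # z_c = f * b / disparity, equation (12)
x = (u - u0) * z / fx                     # back-project through the pinhole model
y = (v - v0) * z / fx
points = np.stack([x, y, z], axis=1)      # N x 3 point cloud of the scene
```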
Step S3: segmenting the three-dimensional point cloud data of the underwater close-range target by using a clustering algorithm to obtain the candidate regions of the underwater close-range target.
The clustering algorithm may be a density-based Mean shift clustering algorithm, a form of unsupervised learning applicable to computer vision tasks such as image segmentation and target tracking. Target-region segmentation in the binocular scene image can be completed with the Mean shift clustering algorithm. Its core idea is that data sets of different categories (clusters) follow different probability density distributions: for each data point, one finds the direction in which the probability density increases fastest, i.e., the probability density gradient direction, and moves the data point along that direction until all data points converge at fixed points. Data points converging at the same fixed point belong to the same category (cluster), and the number of convergence points is the number of categories (clusters).
In an example, completing the segmentation of the target region in the binocular scene image based on the Mean shift clustering algorithm may include: introducing a kernel function $G_H(x)$ to compute the probability density gradient function $M_H(x)$ of the Mean shift clustering algorithm, and cluster-segmenting the three-dimensional point cloud data of each underwater close-range target along its probability density gradient direction to obtain the underwater close-range target candidate regions.
For example, assuming that point $x$ is a sample point in a $d$-dimensional vector space, the basic form of the Mean shift vector of sample point $x$ is defined as:

$$M_h(x) = \frac{1}{k} \sum_{x_i \in S_h} (x_i - x) \tag{13}$$

$$S_h = \left\{\, y \mid (y - x)^T (y - x) \le h^2 \,\right\} \tag{14}$$
in the above formula, ShThe representation is a spherical area of radius h centered on the sample point x in d-dimensional space, where each point in the above formula is located at ShThe point in the spherical area is subtracted from the sample point and the Mean value is calculated, and the Mean shift vector M is easily obtainedh(x) Pointing to the direction in which the data points are most dense, i.e. the direction in which the probability density is greatest. The core of the Mean shift clustering algorithm firstly calculates Mean shift vectors of all data points within a certain radius range, and then adds the Mean shift vectors of all the data points to the Mean shift vectors of all the data points, so that each data point moves a distance towards the gradient direction of the probability density of each data point. And repeating the above process continuously, and continuously moving all the data points in the direction of the probability density gradient until the distance of movement of all the data points is less than a certain set threshold, at which point the data points can be considered to have converged. And finally, classifying the data points with close distances into one class, and calculating the central point of each class, thereby completing the whole process of the Mean shift algorithm. The Mean shift clustering algorithm completes the automatic classification of the data set in an unsupervised state.
In the general form of the Mean shift clustering algorithm, every data point in the spherical region contributes equally to the Mean shift vector of sample point $x$; in many applications this is not reasonable. In most cases it is more reasonable for data points closer to the sample point to contribute more to the Mean shift vector and for data points farther away to contribute less. The Mean shift clustering algorithm with a kernel function was developed to address this shortcoming of the general form.
The biggest difference in the Mean shift clustering algorithm with a kernel function is that a kernel function is introduced into the computation of the Mean shift vector, which can be expressed as:

$$M_H(x) = \frac{\sum_{i=1}^{n} G_H(x_i - x)\,(x_i - x)}{\sum_{i=1}^{n} G_H(x_i - x)} \tag{15}$$
wherein:

$$G_H(x_i - x) = |H|^{-1/2}\, G\!\left(H^{-1/2}(x_i - x)\right) \tag{16}$$

$G_H(x)$ is a kernel function, for example a Gaussian kernel, and $H$ is a $d \times d$ symmetric bandwidth matrix, which can be written as the diagonal matrix:

$$H = \mathrm{diag}\left(h_1^2,\ h_2^2,\ \ldots,\ h_d^2\right) \tag{17}$$

Therefore, the Mean shift vector of the Mean shift clustering algorithm with a kernel function can be rewritten as:

$$M_h(x) = \frac{\sum_{i=1}^{n} G\left(\frac{x_i - x}{h}\right)(x_i - x)}{\sum_{i=1}^{n} G\left(\frac{x_i - x}{h}\right)} \tag{18}$$
in equation (18) above, the kernel weight for each data point is greater than 0 and the sum is 1, so Mh(x) Is a normalized probability density gradient function. And the segmentation and identification of the target object are completed by clustering and segmenting the point cloud of the surface coordinate of the target object.
Step S4: classifying the targets in the underwater close-range target candidate regions by using a classification network to obtain the class labels of the underwater close-range targets, and recognizing and tracking the underwater close-range targets according to their class labels. For example, an SVM classifier may be used to classify the targets in the underwater close-range target candidate regions, complete the identification of the target object, and obtain the class label, which is not limited here.
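A minimal sketch of this classification step under stated assumptions: the patent leaves the classifier open (a classification network, for example an SVM classifier), so the sketch pairs HOG features with a pre-trained scikit-learn SVM, echoing the HOG plus SVM scheme the background section attributes to the ASR-IV robot. The model file underwater_svm.pkl, the HOG window parameters, and the box format are hypothetical.

```python
import cv2
import joblib

# HOG descriptor: window, block, stride, cell sizes and bin count are assumptions
hog = cv2.HOGDescriptor((64, 64), (16, 16), (8, 8), (8, 8), 9)
svm = joblib.load("underwater_svm.pkl")   # hypothetical pre-trained sklearn SVC

def classify_region(image, box):
    """Return the class label for one candidate region (x, y, w, h)."""
    x, y, w, h = box                      # candidate region from the clustering step
    crop = cv2.resize(image[y:y + h, x:x + w], (64, 64))
    if crop.ndim == 3:
        crop = cv2.cvtColor(crop, cv2.COLOR_BGR2GRAY)  # HOG on grayscale
    features = hog.compute(crop).reshape(1, -1)
    return svm.predict(features)[0]       # class label of the close-range target
```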
According to another aspect of the present disclosure, the present disclosure also provides an underwater close-range target recognition and tracking system based on depth fusion, which may include:
the acquisition module is used for shooting an image of an underwater close-range target by using a binocular camera;
the reconstruction module is used for reconstructing three-dimensional point cloud data of the underwater close-range target based on the image and a binocular disparity matching algorithm;
the segmentation module is used for segmenting the three-dimensional point cloud data of the underwater close-range target by utilizing a clustering algorithm to obtain a candidate region of the underwater close-range target;
and the tracking module is used for classifying the targets in the underwater close-range target candidate region by using a classification network to obtain the class labels of the underwater close-range targets and realizing recognition and tracking of the underwater close-range targets according to their class labels.
The underwater close-range target recognition and tracking method and system based on depth fusion solve the problems of large system size, high power consumption, and low target recognition accuracy and speed in existing image-based target recognition and tracking systems of spherical underwater robots, problems that hinder their use in offshore, shallow, and narrow underwater environments. The method has the advantages of small volume, low power consumption, high recognition accuracy, and high recognition speed; it can improve the target recognition performance of the spherical underwater robot and can be used on small spherical underwater robot platforms.
Although the embodiments of the present invention have been described above, the above descriptions are only for the convenience of understanding the present invention, and are not intended to limit the present invention. It will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (6)

1. An underwater close-range target recognition and tracking method based on depth fusion is characterized by comprising the following steps:
shooting an image of an underwater close-range target by using a binocular camera;
reconstructing three-dimensional point cloud data of the underwater close-range target based on the image and a binocular disparity matching algorithm;
utilizing a clustering algorithm to segment the three-dimensional point cloud data of the underwater close-range target to obtain a candidate region of the underwater close-range target;
and classifying the targets in the underwater close-range target candidate region by using a classification network to obtain the class labels of the underwater close-range targets, and recognizing and tracking the underwater close-range targets according to their class labels.
2. The underwater close-range target recognition and tracking method according to claim 1, wherein the shooting of an image of an underwater close-range target by using a binocular camera comprises:
shooting the underwater close-range target by using two relatively fixed cameras located at the same horizontal position to obtain two images of the underwater close-range target.
3. The underwater close-range target recognition and tracking method according to claim 2, wherein the reconstructing three-dimensional point cloud data of the underwater close-range target based on the image and a binocular disparity matching algorithm comprises:
and constructing a disparity map of the underwater close-range target according to the position difference of the underwater close-range target in the two images, and performing three-dimensional reconstruction on pixel points of the underwater close-range target in the two images by using a binocular disparity matching algorithm based on the disparity map of the underwater close-range target to obtain three-dimensional point cloud data of the underwater close-range target.
4. The underwater close-range target recognition and tracking method according to claim 1, characterized in that the clustering algorithm is a density-based Mean shift clustering algorithm.
5. The underwater close-range target recognition and tracking method according to claim 4, wherein the segmenting of the three-dimensional point cloud data of the underwater close-range target with a clustering algorithm to obtain the candidate region of the underwater close-range target comprises:
introducing a kernel function $G_H(x)$ to compute the probability density gradient function $M_H(x)$ of the Mean shift clustering algorithm, and cluster-segmenting the three-dimensional point cloud data of each underwater close-range target along its probability density gradient direction to obtain the candidate region of the underwater close-range target.
6. An underwater close-range target recognition and tracking system based on depth fusion, which is characterized by comprising:
the acquisition module is used for shooting an image of an underwater close-range target by using a binocular camera;
the reconstruction module is used for reconstructing three-dimensional point cloud data of the underwater close-range target based on the image and a binocular disparity matching algorithm;
the segmentation module is used for segmenting the three-dimensional point cloud data of the underwater close-range target by utilizing a clustering algorithm to obtain a candidate region of the underwater close-range target;
and the tracking module is used for classifying the targets in the underwater close-range target candidate region by using a classification network to obtain the class labels of the underwater close-range targets and realizing recognition and tracking of the underwater close-range targets according to their class labels.
CN202110911151.2A 2021-08-06 2021-08-06 Underwater close-range target identification and tracking method and system based on depth fusion Pending CN113792593A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110911151.2A CN113792593A (en) 2021-08-06 2021-08-06 Underwater close-range target identification and tracking method and system based on depth fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110911151.2A CN113792593A (en) 2021-08-06 2021-08-06 Underwater close-range target identification and tracking method and system based on depth fusion

Publications (1)

Publication Number Publication Date
CN113792593A 2021-12-14

Family

ID=78875869

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110911151.2A Pending CN113792593A (en) 2021-08-06 2021-08-06 Underwater close-range target identification and tracking method and system based on depth fusion

Country Status (1)

Country Link
CN (1) CN113792593A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023010766A1 (en) * 2021-08-05 2023-02-09 鹏城实验室 Underwater camera device and underwater robot
CN116681935A (en) * 2023-05-31 2023-09-01 国家深海基地管理中心 Autonomous recognition and positioning method and system for deep sea hydrothermal vent
CN116681935B (en) * 2023-05-31 2024-01-23 国家深海基地管理中心 Autonomous recognition and positioning method and system for deep sea hydrothermal vent

Similar Documents

Publication Publication Date Title
CN110675418B (en) Target track optimization method based on DS evidence theory
CN108445480B (en) Mobile platform self-adaptive extended target tracking system and method based on laser radar
CN108981672A (en) Hatch door real-time location method based on monocular robot in conjunction with distance measuring sensor
CN113506318B (en) Three-dimensional target perception method under vehicle-mounted edge scene
CN110689562A (en) Trajectory loop detection optimization method based on generation of countermeasure network
CN112396650A (en) Target ranging system and method based on fusion of image and laser radar
CN110688905B (en) Three-dimensional object detection and tracking method based on key frame
CN113111887A (en) Semantic segmentation method and system based on information fusion of camera and laser radar
CN105786016A (en) Unmanned plane and RGBD image processing method
Wang et al. An overview of 3d object detection
CN113792593A (en) Underwater close-range target identification and tracking method and system based on depth fusion
Shreyas et al. 3D object detection and tracking methods using deep learning for computer vision applications
Ouyang et al. A cgans-based scene reconstruction model using lidar point cloud
Ren et al. Two AUVs guidance method for self-reconfiguration mission based on monocular vision
CN114399675A (en) Target detection method and device based on machine vision and laser radar fusion
CN112927264A (en) Unmanned aerial vehicle tracking shooting system and RGBD tracking method thereof
CN111860651A (en) Monocular vision-based semi-dense map construction method for mobile robot
CN116978009A (en) Dynamic object filtering method based on 4D millimeter wave radar
CN112950786A (en) Vehicle three-dimensional reconstruction method based on neural network
CN113536959A (en) Dynamic obstacle detection method based on stereoscopic vision
Giosan et al. Superpixel-based obstacle segmentation from dense stereo urban traffic scenarios using intensity, depth and optical flow information
CN116664851A (en) Automatic driving data extraction method based on artificial intelligence
Geiger Monocular road mosaicing for urban environments
CN116185049A (en) Unmanned helicopter autonomous landing method based on visual guidance
CN115170648A (en) Carriage pose determining method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination