CN116681721A - Linear track detection and tracking method based on vision

Info

Publication number: CN116681721A (application CN202310669655.7A); granted as CN116681721B
Authority: CN (China)
Original language: Chinese (zh)
Legal status: Granted; Active
Inventors: 李俊 (Li Jun), 冯云剑 (Feng Yunjian)
Applicant and current assignee: Southeast University
Events: application CN202310669655.7A filed by Southeast University; publication of CN116681721A; application granted; publication of CN116681721B


Classifications

    • G06T7/13 Image analysis - Segmentation; Edge detection
    • G06T5/40 Image enhancement or restoration using histogram techniques
    • G06T7/277 Analysis of motion involving stochastic approaches, e.g. using Kalman filters
    • G06T2207/20104 Interactive definition of region of interest [ROI]
    • G06T2207/30241 Trajectory
    • G06T2207/30256 Lane; Road marking
    • Y02T10/40 Engine management systems


Abstract

The invention provides a vision-based linear track detection and tracking method, which comprises the following steps: first, a linear track image is captured with a visual image acquisition device; the linear track image is enhanced, gradient feature information is extracted, and the edge of each linear track is identified from the gradient magnitude to obtain an edge map; an inverse perspective transformation is applied to the edge map to eliminate the perspective effect introduced during image capture, yielding a transformed edge map; the transformed linear track edges are detected with a sliding-window method, the linear track edges that meet the requirements are screened out, and the equations of the straight lines on which these edges lie are finally obtained by fitting; each linear track that meets the requirements detected in the continuously captured images is tracked with a Kalman filter, and the Kalman gain is adaptively adjusted. The invention can detect and track single or multiple parallel, thin or thick, continuous or discontinuous straight tracks under complex working conditions.

Description

Linear track detection and tracking method based on vision
Technical Field
The invention belongs to the field of computer vision and image processing, and particularly relates to a linear track detection and tracking method based on vision.
Background
In applications such as assisted and automatic driving of vehicles and robot line following, the detection and tracking of straight-line tracks is one of the key core technologies. Since most current linear tracks are sprayed or printed on the ground with colored paint, a visual image acquisition device must be used to capture images of the linear tracks, which are then detected and tracked by processing and analyzing the images. Existing color-based linear track detection methods are easily affected by illumination and weather conditions and show poor detection and tracking robustness in complex environments. Methods based on line detection cannot handle interruption, contamination, or wear of the linear track, which leads to false and missed detections.
With the development of deep learning, linear track detection methods based on deep learning keep emerging. Such methods, however, generally rely on large-scale image datasets with ground-truth labels for training. For many special application scenarios, such as ports and factories, the cost of acquiring and annotating a straight-line track detection dataset covering a variety of weather and lighting conditions is very high, which greatly limits the application of deep-learning-based methods. In addition, such methods have limited generalization ability: a model trained on the dataset of one particular scene cannot be applied to other scenes.
In the context of industrial automation and intelligence, many application scenarios place higher demands on the accuracy of linear track detection, even down to the order of tens of micrometers (e.g., weld seam tracking). This requires the linear track detection algorithm to have strong feature extraction and detection capability and to be able to screen out false detections using environmental information. Moreover, since application scenes vary widely, the detection algorithm must adapt to factors such as illumination, weather, and the condition of the marking line in order to improve the robustness of detection and tracking.
In summary, existing image-processing-based linear track detection methods have poor accuracy and robustness and can hardly meet the requirements of complex scenes, while deep-learning-based methods rely on large-scale datasets and generalize poorly, which limits their application in real scenes. To address these problems, the invention provides a vision-based linear track detection and tracking method that can detect and track single, double, and multiple continuous or discontinuous linear tracks under complex weather and working conditions.
Disclosure of Invention
In order to solve the above problems, the present invention provides a method for detecting and tracking a linear track based on vision, which includes the following steps:
step S1: the visual image acquisition device shoots a linear track image;
step S2: enhancing the linear track image shot in the step S1, extracting gradient characteristic information, and identifying the edge of each linear track according to gradient amplitude information to obtain an edge map;
step S3: eliminating perspective effect generated in the image shooting process by adopting an inverse perspective transformation method on the edge map obtained in the step S2, and obtaining a transformed edge map;
step S4: detecting the converted linear track edges obtained in the step S3 by adopting a sliding window method, screening out the linear track edges meeting the requirements, and finally fitting to obtain an equation of the line where the linear track edges meeting the requirements are located;
step S5: and tracking each linear track meeting the requirements detected in the continuously shot images by adopting a Kalman filter, and adaptively adjusting the Kalman gain according to the detection confidence of the edge of the linear track.
Further, the step S1 specifically includes the following steps:
step 1-1: the visual image acquisition device comprises, but is not limited to, a monitoring camera, an industrial camera and a network camera;
step 1-2: the visual image acquisition device is deployed at a suitable position on a vehicle or mobile robot so that the linear track to be detected and tracked lies within its field of view; the optical axis of the device is either perpendicular to the ground or inclined toward the ground at an angle γ, with γ ∈ (0°, 90°), and the device is used to capture images of the linear track.
Further, the step S2 specifically includes the following steps:
step 2-1: the linear track images captured by the visual image acquisition device are taken as input one by one, and the region of interest containing the linear track is delimited in the input image with a rectangular frame of fixed size;
step 2-2: carrying out graying treatment on the image in the region of interest;
step 2-3: carrying out image enhancement on the image subjected to the graying treatment;
step 2-4: extracting the gradient information G_x and G_y in the horizontal and vertical directions of the enhanced image with the Sobel operator:

G_x = S_x * I, G_y = S_y * I,

where S_x and S_y are the standard 3×3 horizontal and vertical Sobel kernels, * denotes the convolution operation, and I denotes the region-of-interest image after graying and image enhancement;

step 2-5: the gradient magnitude GM of each pixel is calculated according to:

GM(x, y) = |G_x(x, y)|,

where G_x(x, y) denotes the horizontal gradient G_x of the pixel at coordinates (x, y);

step 2-6: a gradient magnitude threshold T_M is set; the pixels in the region-of-interest image whose gradient magnitude GM is greater than or equal to T_M are taken as edge pixels, giving the edge map E:

E(x, y) = 255 if GM(x, y) ≥ T_M, and E(x, y) = 0 otherwise.
further, the step S3 specifically includes the following steps:
step 3-1: the inverse perspective transformation of the edge map obtained in step S2 is performed as:

s · [x y 1]^T = A_3×3 · [u v 1]^T,

where A_3×3 denotes the inverse perspective transformation matrix, s is a homogeneous scale factor, [u v 1]^T denotes the homogeneous coordinates of a pixel in the original image (the source point coordinates), and [x y 1]^T denotes the transformed pixel coordinates (the target point coordinates); at least four pairs of corresponding source and target points, no three of which are collinear, are needed to solve for the inverse perspective transformation matrix A_3×3;
Step 3-2: during initialization, for a single thin linear track, selecting intersection points of the edges of the linear track and the upper and lower boundaries of the region of interest in the image in a manual selection mode, and calculating two points of which the two intersection points are symmetrical with respect to the central line of the edge map E in the horizontal direction to serve as four source points;
for a single thick straight line track, selecting intersection points of the left and right edges of the straight line track and the upper and lower boundaries of the region of interest in an image in a manual selection mode, and taking the intersection points as four source points;
for a plurality of thin linear tracks which are parallel to each other, selecting intersection points of the edges of the first linear track and the last linear track from left and the upper and lower boundaries of the region of interest in the image in a manual selection mode as four source points;
For a plurality of thick straight-line tracks which are parallel to each other, selecting intersection points of the left side edge of a first straight-line track from left side and the right side edge of a last straight-line track with the upper and lower boundaries of the region of interest in an image in a manual selection mode as four source points;
respectively named LB, LU, RU and RB in anticlockwise order from the source point in the lower left corner, and their coordinates are [ u ] LB v LB 1] T 、[u LU v LU 1] T 、[u RU v RU 1] T and [uRB v RB 1] T
Step 3-3: the coordinates of the target points corresponding to the four source points are respectively
Step 3-4: obtaining an inverse perspective transformation matrix A in the step 3-1 according to the coordinates of the source point and the target point obtained in the steps 3-2 and 3-3 3×3 Combining the homogeneous coordinates of all pixel points in the edge map obtained in the step S2 with A 3×3 Multiplying to obtain a transformed edge map;
step 3-5: for an image containing multiple mutually parallel linear tracks, the source point coordinates of the current frame are selected automatically from the linear track detection result of the previous frame, so that the inverse perspective transformation matrix is adjusted adaptively; under the perspective effect, the originally parallel straight-line tracks intersect at a vanishing point in the image; the vanishing point coordinates are obtained as the least-squares intersection of the straight-line equations of the track edges detected in the previous frame; then, for thin linear tracks, the vanishing point is connected to the lower end points of the first and the last track edge from the left, and for thick linear tracks, to the lower end points of the left edge of the first track and the right edge of the last track from the left; the intersection points of these two connecting lines with the upper and lower boundaries of the region of interest form the set of source points for the current frame image.
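A minimal sketch of the inverse perspective transformation of step S3, assuming OpenCV is used to solve for and apply A_3×3 from the four LB/LU/RU/RB source points and their target points (function and variable names are illustrative):

```python
import cv2
import numpy as np

def warp_edge_map(edge_map, src_pts, dst_pts):
    """src_pts, dst_pts: 4x2 point arrays ordered [LB, LU, RU, RB]."""
    a33 = cv2.getPerspectiveTransform(np.float32(src_pts), np.float32(dst_pts))
    h, w = edge_map.shape[:2]
    # nearest-neighbour interpolation keeps the warped edge map binary
    warped = cv2.warpPerspective(edge_map, a33, (w, h), flags=cv2.INTER_NEAREST)
    return warped, a33
```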
Further, the step S4 specifically includes the following steps:
step 4-1: the edge maps after the inverse perspective transformation are each projected onto the horizontal axis of the image to obtain the corresponding statistical histograms;

step 4-2: each statistical histogram is mean-filtered by one-dimensional convolution to remove noise in the data, and only the points whose statistic exceeds T_E% of the maximum of the statistics are retained;

step 4-3: after filtering, the abscissa of the peak point of each histogram gives the initial base point coordinate of the corresponding edge;

step 4-4: starting from the initial base point, a search is performed along the vertical axis of the image with a fixed window size h×w and a fixed step s, where h is the window height and w is the window width, equal in value to the width threshold that separates thick and thin straight tracks; if the number of edge pixels inside the window exceeds a set threshold T_P, the window is regarded as an effective window, the edge pixels inside it are regarded as effective edge pixels, and the mean of the abscissas of all edge pixels inside the window is computed to update the base point coordinate of the search window;

step 4-5: if the number of edge pixels inside the window does not exceed the set threshold T_P, the window is regarded as an invalid window, the base point coordinate is kept unchanged, and the search continues along the vertical axis until the preset number of searches N is reached;
step 4-6: for each linear track edge, if its number of effective windows is greater than a threshold T_W, the linear track edge is regarded as a candidate edge; for a single thin or thick linear track, the candidate edge is already the linear track edge that meets the requirements;

for multiple parallel thin or thick linear tracks, after the linear track edges have been paired, the candidates that satisfy the distance constraint between the two candidate edges of the same pair are screened out, as described in step 4-7:

step 4-7: for multiple mutually parallel thin linear tracks, let B_t and B_{t+1} (t = 1, 2, ..., T-1) be the sets of initial base point coordinates of all candidate edges in the t-th pair of edge maps Gr_t = <E_t, E_{t+1}>; the initial base point coordinates of a matching pair of candidate edges must satisfy |b_{t+1} - b_t - D_t| ≤ T_t for b_t ∈ B_t and b_{t+1} ∈ B_{t+1}, where D_t is the fixed distance between the edges of the t-th pair of linear tracks and T_t is the distance deviation threshold between the edges of the t-th pair of linear tracks; the coordinate sets B'_t and B'_{t+1} (t = 1, 2, ..., T-1) that satisfy the above condition give the effective initial base point coordinates, and the corresponding candidate edges are the linear track edges that meet the requirements;

for multiple mutually parallel thick linear tracks, the sets of initial base point coordinates of all candidate edges in the t-th pair of left edge maps and in the t-th pair of right edge maps are defined analogously; the initial base point coordinates of a matching pair of candidate edges must satisfy |b^L_{t+1} - b^L_t - D^L_t| ≤ T^L_t for the left edges and |b^R_{t+1} - b^R_t - D^R_t| ≤ T^R_t for the right edges, where D^L_t and D^R_t are the fixed distances between the left edges and between the right edges of the t-th pair of linear tracks, and T^L_t and T^R_t are the corresponding distance deviation thresholds; the coordinates that satisfy these conditions give the effective initial base point coordinates, and the corresponding candidate edges are the linear track edges that meet the requirements;
step 4-8: and carrying out straight line fitting on the effective edge pixel points contained in each linear track edge meeting the requirements to obtain a linear equation corresponding to the edge.
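The sliding-window search and line fitting of step S4 could be sketched as follows for a single edge map (illustrative only; the defaults for the window width, search count and thresholds are placeholders, and the adaptive threshold T_P of steps 4-4/4-5 is replaced here by a fixed count):

```python
import numpy as np

def detect_edge_line(edge_map, win_w=15, n_search=10, t_p=30, t_w=4):
    """Histogram projection -> windowed search along the vertical axis -> straight-line fit."""
    h_img, w_img = edge_map.shape
    hist = (edge_map > 0).sum(axis=0).astype(float)              # projection onto the horizontal axis
    hist = np.convolve(hist, np.ones(5) / 5.0, mode="same")      # one-dimensional mean filtering
    base_x = int(np.argmax(hist))                                # initial base point
    win_h = h_img // n_search                                    # h = s = H / N
    xs, ys, valid = [], [], 0
    for i in range(n_search):                                    # search along the vertical axis
        y0, y1 = h_img - (i + 1) * win_h, h_img - i * win_h
        x0, x1 = max(base_x - win_w // 2, 0), min(base_x + win_w // 2, w_img)
        win_ys, win_xs = np.nonzero(edge_map[y0:y1, x0:x1])
        if len(win_xs) > t_p:                                    # effective window
            valid += 1
            base_x = int(np.mean(win_xs)) + x0                   # update the base point
            xs.extend(win_xs + x0)
            ys.extend(win_ys + y0)
    if valid <= t_w:
        return None                                              # not a candidate edge
    k, b = np.polyfit(ys, xs, 1)                                 # fitted line x = k*y + b
    return k, b
```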
Further, the specific method for pairing the linear track edges is as follows:
for a plurality of thin linear tracks parallel to each other, the linear tracks are numbered 1 to T from left to right along the horizontal direction of the image, where T denotes the number of linear tracks, and the edge map E is stored as T edge maps E_1, E_2, ..., E_T according to the linear track to which each edge belongs;

the edges of two adjacent linear tracks are grouped into a pair Gr_t according to their relative positions, as follows:

Gr_t = <E_t, E_{t+1}>, t = 1, 2, ..., T-1;
for a thick straight-line track in the input image, the linear track edges are divided into three classes, left, right and other, according to the gradient direction information, as follows:

the gradient direction feature GD of each pixel is calculated from G_x(x, y) and G_y(x, y), the horizontal and vertical gradient information of the pixel at coordinates (x, y);

all pixels in the edge map E are divided into a left set, a right set and an other set according to the direction feature GD; edge pixels belonging to the other class are ignored, i.e. E(x, y) = 0; pixels belonging to the left class are stored as the left edge map E_L, i.e. E_L(x, y) = 255; pixels belonging to the right class are stored as the right edge map E_R, i.e. E_R(x, y) = 255;
for a plurality of thick linear tracks parallel to each other, the linear tracks are numbered 1 to T from left to right along the horizontal direction of the image, where T denotes the number of linear tracks, and the edge map E is stored as 2T edge maps E_1^L, E_1^R, ..., E_T^L, E_T^R according to the linear track and the edge class (left or right) to which each edge belongs;

the same-class edges of two adjacent linear tracks are grouped into a pair Gr_t^L = <E_t^L, E_{t+1}^L> or Gr_t^R = <E_t^R, E_{t+1}^R> according to their relative positions, with t = 1, 2, ..., T-1.
Further, the effective pixel number threshold T_P in steps 4-4 and 4-5 is adjusted adaptively according to the linear track detection result and the number of searches, where N denotes the preset number of searches, P_x denotes the sum of the statistics of the effective edge pixels within the search window, h(i) denotes the statistic corresponding to the coordinate i in the statistical histogram, and x denotes the abscissa of the initial base point.
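One possible reading of this adaptive threshold, sketched under the assumption that T_P is the histogram mass of the window columns divided by the number of searches N (the patent's exact formula may differ):

```python
import numpy as np

def adaptive_t_p(hist, base_x, win_w, n_search):
    """hist: statistical histogram h(i); base_x: abscissa x of the initial base point."""
    x0 = max(base_x - win_w // 2, 0)
    x1 = min(base_x + win_w // 2, len(hist))
    p_x = float(np.sum(hist[x0:x1]))   # P_x: edge-pixel statistics around the base point
    return p_x / n_search              # assumed T_P = P_x / N
```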
Further, the step S5 specifically includes the following steps:
step 5-1: the number of effective windows in step 4-6 reflects how reliable the current candidate edge detection is; the more effective windows there are, the higher the detection confidence of the candidate edge;

for a thin straight-line track, the detection confidence is calculated from the number of effective windows, where c_t denotes the detection confidence of the t-th linear track edge that meets the requirements, n_t denotes the number of effective windows of that edge, N denotes the preset number of searches, β denotes a correction coefficient, and T denotes the total number of linear tracks that meet the requirements;

for a thick straight-line track, the detection confidence is calculated in the same way for each side, where c_t^L and c_t^R denote the detection confidences of the left and right edges of the t-th linear track that meets the requirements, n_t^L and n_t^R denote the numbers of effective windows of the left and right edges of that track, N denotes the preset number of searches, β denotes the correction coefficient, and T denotes the total number of linear tracks that meet the requirements;
step 5-2: for a thin linear track, the straight line of each linear track edge that meets the requirements forms two intersection points with the upper and lower boundaries of the region of interest, each with a corresponding confidence; a correction coefficient matrix C_k is constructed from these confidences;

for a thick linear track, the straight lines of the left and right edges of each linear track that meets the requirements form four intersection points with the upper and lower boundaries of the region of interest, each with a corresponding confidence; a correction coefficient matrix C_k is constructed from these confidences;

step 5-3: the corrected measurement error covariance matrix R' is calculated as

R' = diag(C_k) R,

where diag(·) denotes the diagonalization operation and R denotes the initial measurement error covariance matrix, R = s_R × U, with s_R a scaling coefficient and U the identity matrix;

step 5-4: the corrected Kalman gain K_k is

K_k = P_k M^T (M P_k M^T + R')^(-1),

where P_k denotes the covariance matrix of the prior estimation error of the k-th frame image, initialized as P_0 = U, and M denotes the measurement matrix, whose form differs between thin and thick straight-line tracks and whose element in the i-th row and j-th column is denoted M_ij;

step 5-5: the covariance matrix P_k of the prior estimation error in step 5-4 is calculated as

P_k = A P_{k-1}^+ A^T + Q,

where Q denotes the covariance matrix of the process noise, Q = s_Q × U, with s_Q a scaling coefficient, the superscript T denotes the matrix transpose, and A denotes the state transition matrix, whose form differs between thin and thick straight-line tracks;

P_{k-1}^+ denotes the covariance matrix of the posterior estimation error of frame k-1, which is updated as

P_k^+ = (U - K_k M) P_k,

where U denotes the identity matrix;
step 5-6: for a thin linear track, the straight line of the t-th linear track edge that meets the requirements forms 2 intersection points with the upper and lower boundaries of the region of interest, and the abscissas of these intersection points are used to construct the measurement of the k-th frame;

for a thick straight-line track, the straight lines of the left and right edges of the t-th linear track that meets the requirements, obtained in step 4-8, form 4 intersection points with the upper and lower boundaries of the region of interest, and the abscissas of these intersection points are used to construct the measurement of the k-th frame;

the measurement matrix Z_k of the k-th frame is calculated as

Z_k = M X_k,

where X_k denotes the state estimation matrix of the k-th frame;
step 5-7: for a thin straight-line track, the state estimation matrix contains the estimates of the abscissas of the 2 intersection points formed by the straight line of the t-th linear track edge with the upper and lower boundaries of the region of interest, estimated from the prediction result of the previous frame image, together with the changes of these abscissas between two adjacent frames;

for a thick straight-line track, the width of the straight track is used as an additional constraint for tracking: the state estimation matrix contains the estimates of the abscissas of the 4 intersection points formed by the straight lines of the left and right edges of the t-th linear track with the upper and lower boundaries of the region of interest, estimated from the prediction result of the previous frame image, the changes of these abscissas between two adjacent frames, the widths of the top and the bottom of the straight track, and the changes of these widths between two adjacent frames;
step 5-8: the state estimation matrix X_k of the k-th frame is calculated from the posterior state estimation matrix X_{k-1}^+ of frame k-1 as

X_k = A X_{k-1}^+;

for a thin straight-line track, the initial value X_0^+ is given by the detected abscissas of the end points of the linear track edges in the first frame image; for a thick straight-line track, the initial value X_0^+ is given by the detected abscissas of the end points of the linear track edges and the widths of the linear track in the first frame image;
Step 5-9: the final output result of the linear track tracking is thatThe calculation method is as follows:
the beneficial effects are that: the gradient feature extraction method based on image enhancement is adopted, so that the robustness of linear track edge detection to illumination and weather changes can be enhanced, and the adaptability to complex environments can be improved; the linear track detection method based on the sliding window can effectively solve the problems of mark line interruption, pollution, abrasion and the like; according to the linear track detection result, the Kalman gain is adaptively adjusted, and the stability of lane line tracking is enhanced. The linear track detection method provided by the invention does not need to rely on data set training, has higher detection precision and robustness, and can realize detection and tracking of single or multiple parallel thin or thick continuous or discontinuous linear tracks under complex environments and working conditions.
Drawings
FIG. 1 is a schematic view of the installation of a monitoring camera on a port tire gantry crane;
fig. 2 is an image of a straight track (lane line) photographed by the monitoring camera;
FIG. 3 is a flow chart in accordance with the present invention;
FIG. 4 is a graph of edge detection results;
FIG. 5 is the transformed edge map of the right edge of the second straight-line track after the inverse perspective transformation;
FIG. 6 is a schematic diagram of the vanishing-point-based source point selection method;
FIG. 7 is the statistical histogram corresponding to the right edge of the second straight-line track;
FIG. 8 is the filtered statistical histogram corresponding to the right edge of the second straight-line track;
FIG. 9 is a schematic diagram of a method for detecting a straight line trajectory based on a sliding window method;
FIG. 10 is a schematic diagram of a mobile robot configuration;
fig. 11 is a weld image taken by an industrial camera.
Detailed Description
To fully illustrate the detection and tracking process of the invention under different deployments of the visual acquisition device and different forms of linear track, two specific embodiments are described below in detail with reference to the accompanying drawings: lane line detection for a port tire-type gantry crane and weld seam tracking for a mobile robot.
Inventive example 1:
As shown in FIG. 1, in the port tire-type gantry crane lane line detection embodiment, the vision-based linear track detection and tracking method of the invention relies on the following hardware: the upright post 2 of the port tire-type gantry crane stands between the two linear tracks 3, the visual image acquisition device 1 is mounted on the upright post 2, and the optical axis of the device 1 forms a 45° angle with the upright post and is inclined toward the ground.
Based on the hardware system, the visual linear track detection and tracking method of the embodiment comprises the following steps:
step 1: the visual image acquisition device shoots a linear track image;
step 1-1: the visual image acquisition device comprises, but is not limited to, a monitoring camera, an industrial camera and a network camera, wherein the monitoring camera is adopted in the embodiment;
step 1-2: the monitoring camera is deployed on the upright post of the port tire-type gantry crane so that the linear tracks to be detected and tracked are within its field of view; the optical axis of the camera forms an included angle of γ = 45° with the upright post and is inclined toward the ground;

step 1-3: an image of the straight-line tracks is captured with the monitoring camera, as shown in FIG. 2. In this embodiment, the linear tracks to be detected and tracked are two lane lines: two parallel yellow lines 12 cm wide whose inner edges are 60 cm apart; the width of each lane line in the horizontal direction of the image is measured as 25 pixels, which is greater than the threshold w = 15 pixels set in this embodiment, so the straight-line tracks belong to the case of multiple mutually parallel thick straight-line tracks.
A flow chart of the method according to the invention is shown in fig. 3, comprising the following steps:
Step 2: enhancing the linear track image shot in the step 1, extracting gradient characteristic information, and identifying the edge of each linear track according to gradient amplitude information to obtain an edge map;
step 2-1: the linear track images captured by the visual image acquisition device are taken as input one by one, and the region of interest containing the linear track is delimited in the input image with a rectangular frame of fixed size;
step 2-2: the image in the region of interest is subjected to graying processing, and the graying processing method adopted in this embodiment is as follows:
I(x, y) = λ_1 R(x, y) + λ_2 G(x, y) + λ_3 B(x, y)

where the weight vector λ = [λ_1, λ_2, λ_3] = [0.3, 0.3, -0.3], R(x, y), G(x, y) and B(x, y) denote the R, G and B color components of the pixel at coordinates (x, y), and I(x, y) denotes the converted gray value;

step 2-3: image enhancement is applied to the grayed image using the contrast-limited adaptive histogram equalization (CLAHE) algorithm: the whole image is divided into several regions, histogram equalization is performed on each region, and the results are fused by bilinear interpolation, which effectively improves the contrast of the image.
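For this embodiment, the weighted graying and CLAHE enhancement could be sketched as follows (assuming a BGR image as delivered by OpenCV; the negative blue weight in λ = [0.3, 0.3, -0.3] suppresses the blue channel so that the yellow lane lines stand out before equalization):

```python
import cv2
import numpy as np

def gray_and_enhance(roi_bgr):
    """Weighted graying I = 0.3R + 0.3G - 0.3B followed by CLAHE."""
    b, g, r = cv2.split(roi_bgr.astype(np.float32))
    gray = np.clip(0.3 * r + 0.3 * g - 0.3 * b, 0, 255).astype(np.uint8)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    return clahe.apply(gray)
```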
Step 2-4: respectively extracting gradient information G in the horizontal direction and the vertical direction of the enhanced image by utilizing Sobel operator x and Gy
Wherein, the convolution operation is represented, I represents the region of interest image which is subjected to gray processing and image enhancement;
step 2-5: the gradient magnitude GM for each pixel is calculated according to:
GM(x,y)=|G x (x,y)|
wherein Gx (x, y) represents gradient information G of the pixel at the coordinates (x, y) x
Step 2-6: setting a gradient amplitude threshold T M The embodiment of the invention takes T M =20, gradient magnitude GM in region of interest image is equal to or greater than threshold T M The pixel points of (a) are edge pixel points, and an edge map E is obtained, as shown in fig. 4:
for a thick linear track, the linear track edges are divided into three classes, left, right and other, according to the gradient direction information, as follows:

the gradient direction feature GD of each pixel is calculated from G_x(x, y) and G_y(x, y), the horizontal and vertical gradient information of the pixel at coordinates (x, y);

all pixels in the edge map E are divided into a left set, a right set and an other set according to the direction feature GD; in this embodiment, edge pixels belonging to the other class are ignored, i.e. E(x, y) = 0; pixels belonging to the left class are stored as the left edge map E_L, i.e. E_L(x, y) = 255; pixels belonging to the right class are stored as the right edge map E_R, i.e. E_R(x, y) = 255;
for a plurality of thick linear tracks parallel to each other, the linear tracks are numbered 1 to T from left to right along the horizontal direction of the image, where T denotes the number of linear tracks; in this embodiment T = 2, and the edge map E is stored as 4 edge maps E_1^L, E_1^R, E_2^L, E_2^R according to the linear track and the edge class to which each edge belongs;

the same-class edges of the two adjacent linear tracks are grouped into a pair Gr_1^L = <E_1^L, E_2^L> or Gr_1^R = <E_1^R, E_2^R> according to their relative positions.
step 3: eliminating perspective effect generated in the image shooting process by adopting an inverse perspective transformation method on the edge map obtained in the step 2, and obtaining a transformed edge map;
step 3-1: the inverse perspective transformation of the edge map obtained in step 2 is performed as:

s · [x y 1]^T = A_3×3 · [u v 1]^T,

where A_3×3 denotes the inverse perspective transformation matrix, s is a homogeneous scale factor, [u v 1]^T denotes the homogeneous coordinates of a pixel in the original image (the source point coordinates), and [x y 1]^T denotes the transformed pixel coordinates (the target point coordinates); at least four pairs of corresponding source and target points, no three of which are collinear, are needed to solve for the inverse perspective transformation matrix A_3×3;

step 3-2: during initialization, for a plurality of thick straight-line tracks parallel to each other, the intersection points of the left edge of the first straight-line track from the left and the right edge of the last straight-line track with the upper and lower boundaries of the region of interest are selected manually in the image as the four source points;

the four source points are named LB, LU, RU and RB in counterclockwise order starting from the source point in the lower left corner, and their coordinates are [u_LB v_LB 1]^T, [u_LU v_LU 1]^T, [u_RU v_RU 1]^T and [u_RB v_RB 1]^T;
Step 3-3: the coordinates of the target points corresponding to the four source points are respectively
Step 3-4: obtaining an inverse perspective transformation matrix A in the step 3-1 according to the coordinates of the source point and the target point obtained in the steps 3-2 and 3-3 3×3 Combining the homogeneous coordinates of all pixel points in the edge map obtained in the step S2 with A 3×3 Multiplying to obtain a transformed edge map; taking the right edge of the second straight line track as an example, the transformed edge graph is shown as the figure5, the abscissa and ordinate axes in the figure represent the abscissa and ordinate axes of the image, respectively;
step 3-5: for an image containing multiple mutually parallel linear tracks, the source point coordinates of the current frame are selected automatically from the linear track detection result of the previous frame, so that the inverse perspective transformation matrix is adjusted adaptively; as shown in FIG. 6, under the perspective effect the originally parallel straight-line tracks intersect at a vanishing point in the image (the five-pointed star in the figure); the vanishing point coordinates are obtained as the least-squares intersection of the straight-line equations of the track edges detected in the previous frame; then, for thick linear tracks, the vanishing point is connected to the lower end points of the left edge of the first track from the left and of the right edge of the last track; the intersection points of these two connecting lines with the upper and lower boundaries of the region of interest (the cross marks and the rectangular frame in the figure) form the set of source points for the current frame image.
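A sketch of this vanishing-point-based source point update, assuming each previous-frame edge is available as a fitted line x = k·y + b in image coordinates (all names are illustrative):

```python
import numpy as np

def vanishing_point(lines):
    """lines: list of (k, b) with x = k*y + b; returns their least-squares intersection."""
    a = np.array([[1.0, -k] for k, _ in lines])   # x - k*y = b
    rhs = np.array([b for _, b in lines])
    (x_v, y_v), *_ = np.linalg.lstsq(a, rhs, rcond=None)
    return x_v, y_v

def update_source_points(vp, bottom_left, bottom_right, y_top, y_bottom):
    """Cut the lines joining the vanishing point to the two bottom end points with the
    upper and lower ROI boundaries; returns the source points [LB, LU, RU, RB]."""
    def cut(p, y):
        t = (y - vp[1]) / (p[1] - vp[1])
        return (vp[0] + t * (p[0] - vp[0]), y)
    lb, lu = cut(bottom_left, y_bottom), cut(bottom_left, y_top)
    rb, ru = cut(bottom_right, y_bottom), cut(bottom_right, y_top)
    return [lb, lu, ru, rb]
```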
Step 4: detecting the converted linear track edges obtained in the step S3 by adopting a sliding window method, screening out the linear track edges meeting the requirements, and finally fitting to obtain an equation of the line where the linear track edges meeting the requirements are located;
step 4-1: the edge maps after the inverse perspective transformation are each projected onto the horizontal axis of the image to obtain the corresponding statistical histograms; taking the right edge of the second straight-line track as an example, the statistical histogram is shown in FIG. 7, in which the horizontal axis represents the abscissa of the image and the vertical axis represents the number of edge pixels;

step 4-2: each statistical histogram is mean-filtered by one-dimensional convolution to remove noise in the data, and only the points whose statistic exceeds T_E% of the maximum of the statistics are retained; in this embodiment T_E% = 25%; taking the right edge of the second straight-line track as an example, the filtered histogram is shown in FIG. 8, in which the horizontal axis represents the abscissa of the image and the vertical axis represents the number of edge pixels;
step 4-3: after filtering, the abscissa of the peak point of each histogram gives the initial base point coordinate of the corresponding edge;

step 4-4: starting from the initial base point, a search is performed along the vertical axis of the image with a fixed window size h×w and step s, where h is the window height and w is the window width, equal in value to the width threshold separating thick and thin straight lines. In this embodiment w = 15 pixels, and the window height h equals the step s, calculated as h = s = H/N, where H denotes the height of the region of interest and N the preset number of searches; in this embodiment H = 344 pixels and N = 10;

if the number of edge pixels inside the window exceeds a set threshold T_P, the window is regarded as an effective window, the edge pixels inside it are regarded as effective edge pixels, and the mean of the abscissas of all edge pixels inside the window is computed to update the base point coordinate of the search window. The effective pixel number threshold T_P can be adjusted adaptively according to the linear track detection result and the number of searches, where P_x denotes the sum of the statistics of the effective edge pixels within the search window, h(i) denotes the statistic corresponding to the coordinate i in the statistical histogram, and x denotes the abscissa of the initial base point;

step 4-5: if the number of edge pixels inside the window does not exceed the set threshold T_P, the window is regarded as an invalid window, the base point coordinate is kept unchanged, and the search continues along the vertical axis until the preset number of searches N is reached, as shown in FIG. 9, in which the horizontal and vertical axes represent the horizontal and vertical coordinates of the image;
Step 4-6: for each linear track edge, if its corresponding number of active windows is greater than a threshold T W The embodiment of the invention takes T W =4, then the straight track edge is considered as a candidate edge; for a plurality of thick linear tracks parallel to each other, after the edges of the linear tracks are matched, the candidate edges meeting the requirements are screened out by combining the distance constraint between the same pair of candidate edges, and the method specifically comprises the following steps of 4-7:
step 4-7: the candidate edges described in steps 4-6 may have false detection results, and if the distance between the plurality of straight-line tracks is fixed, the false detected candidate edges may be screened out by the distance constraint between the candidate edges. The initial base point coordinates of all candidate edges in the four edge graphs form a set respectivelyThe initial base point coordinates of the candidate edge must meet the following condition:
wherein and />Respectively the fixed distance between the left side edge and the right side edge of the first pair of straight lines, T 1 L and T1 R The distance deviation threshold values between the left side edge and the right side edge of the first pair of linear tracks are respectively; the embodiment of the invention adopts->For the initial set of base coordinates satisfying the above condition +.>Effective initial base point coordinates are
The corresponding candidate edges are the linear track edges meeting the requirements;
Step 4-8: and carrying out straight line fitting on the effective edge pixel points contained in each linear track edge meeting the requirements to obtain a linear equation corresponding to the edge.
Step 5: and tracking each linear track meeting the requirements detected in the continuously shot images by adopting a Kalman filter, and adaptively adjusting the Kalman gain according to the detection confidence of the edge of the linear track.
Step 5-1: the number of effective windows described in step 4-6 reflects the reliability of the current candidate edge detection, and the greater the number of effective windows, the higher the confidence of the candidate edge detection. For a thick straight line track, the detection confidence calculating method comprises the following steps:
wherein and />The detection confidence degrees corresponding to the left edge and the right edge of the t th line track meeting the requirements are respectively shown, and the detection confidence degrees are +.> and />Respectively representing the number of effective windows at the left and right edges of the t th line track meeting the requirements, wherein N represents preset searching times, and beta represents a correction coefficient;
step 5-2: for the thick linear track, the straight line where the left and right edges of each linear track meeting the requirements are located forms four intersection points with the upper and lower boundaries of the region of interest, and the confidence degrees corresponding to the intersection points are respectively as follows and />And-> Then construct a correction coefficient matrix according to the confidence level>
Step 5-3: calculating a corrected measurement error covariance matrix R'
R'=diag(C k )R
Wherein diag (·) represents the diagonalization operation and R represents the initialMeasurement error covariance matrix, r=s R ×U,s R Representing scaling factors, embodiments of the present invention take s R =500, u represents a unit array;
step 5-4: the corrected Kalman gain is
K k =P k M T (MP k M T +R') -1
wherein Pk Covariance matrix representing prior estimation error of kth frame image, initialized to P 0 =U,Representing a measurement matrix in which the elements M of the ith row, jth column ij The following are listed below
Step 5-5: covariance matrix P of the prior estimation error described in step 5-4 k The calculation method of (1) is that
Where Q represents the covariance matrix of the process noise, q=s Q ×U,s Q Representing scaling factors, embodiments of the present invention take s Q =0.1. A represents a transition state matrix
Covariance matrix representing posterior estimation error of k-1 frame, which is updated as follows
Wherein U represents an identity matrix;
step 5-6: for a thick straight-line track, the straight lines of the left and right edges of the t-th linear track that meets the requirements, obtained in step 4-8, form 4 intersection points with the upper and lower boundaries of the region of interest, and the abscissas of these intersection points are used to construct the measurement of the k-th frame;

the measurement matrix Z_k of the k-th frame is calculated as

Z_k = M X_k;

step 5-7: for a thick straight-line track, the width of the straight track is used as an additional constraint for tracking: the state estimation matrix contains the estimates of the abscissas of the 4 intersection points formed by the straight lines of the left and right edges of the t-th linear track with the upper and lower boundaries of the region of interest, estimated from the prediction result of the previous frame image, the changes of these abscissas between two adjacent frames, the widths of the top and the bottom of the straight track, and the changes of these widths between two adjacent frames;
step 5-8: the state estimation matrix X_k of the k-th frame is calculated from the posterior state estimation matrix X_{k-1}^+ of frame k-1 as

X_k = A X_{k-1}^+;

the initial value X_0^+ is given by the detected abscissas of the end points of the linear track edges and the widths of the linear track in the first frame image;
step 5-9: the final output of the linear track tracking is the posterior state estimate X_k^+ of the k-th frame, obtained by correcting the prediction X_k with the Kalman gain K_k applied to the difference between the measurement constructed in step 5-6 and the predicted measurement Z_k.
inventive example 2:
As shown in FIG. 10, in the mobile robot weld seam tracking embodiment, the vision-based linear track detection and tracking method of the invention relies on the following hardware: the visual image acquisition device 1 is mounted on a beam of the mobile robot platform 2, the optical axis of the device 1 is perpendicular to the ground, and the linear track 3 is located at the center of the field of view of the device 1.
Based on the hardware system, the visual linear track detection and tracking method of the embodiment comprises the following steps:
step 1: the visual image acquisition device shoots a linear track image;
step 1-1: the visual image acquisition equipment comprises, but is not limited to, a monitoring camera, an industrial camera and a network camera, wherein the industrial camera is adopted in the embodiment of the invention;
step 1-2: the industrial camera is deployed on a beam of the mobile robot so that the linear track to be detected and tracked is within its field of view, with the optical axis of the camera perpendicular to the ground; an image of the linear track is captured with the industrial camera, as shown in FIG. 11. In this embodiment, the linear track to be detected and tracked is a single weld seam 0.1-0.2 mm wide; its width in the horizontal direction of the image is 10 pixels, which is smaller than the threshold w = 60 pixels, so the linear track belongs to the case of a single thin straight-line track.
A flow chart of the method according to the invention is shown in fig. 3, comprising the following steps:
step 2: enhancing the linear track image shot in the step S1, extracting gradient characteristic information, and identifying the edge of each linear track according to gradient amplitude information to obtain an edge map;
step 2-1: the linear track images captured by the visual image acquisition device are taken as input one by one, and the region of interest containing the linear track is delimited in the input image with a rectangular frame of fixed size;
step 2-2: the image in the region of interest is subjected to graying processing, and the graying processing method adopted in this embodiment is as follows:
I(x, y) = λ_1 R(x, y) + λ_2 G(x, y) + λ_3 B(x, y)

where the weight vector λ = [λ_1, λ_2, λ_3] = [0.3, 0.3, -0.3], R(x, y), G(x, y) and B(x, y) denote the R, G and B color components of the pixel at coordinates (x, y), and I(x, y) denotes the converted gray value;

step 2-3: image enhancement is applied to the grayed image using the contrast-limited adaptive histogram equalization (CLAHE) algorithm: the whole image is divided into several regions, histogram equalization is performed on each region, and the results are fused by bilinear interpolation, which effectively improves the contrast of the image.
step 2-4: the gradient information G_x and G_y in the horizontal and vertical directions of the enhanced image is extracted with the Sobel operator, where * denotes the convolution operation and I denotes the region-of-interest image after graying and image enhancement;

step 2-5: the gradient magnitude GM of each pixel is calculated according to:

GM(x, y) = |G_x(x, y)|,

where G_x(x, y) denotes the horizontal gradient G_x of the pixel at coordinates (x, y);

step 2-6: a gradient magnitude threshold T_M is set; the pixels in the region-of-interest image whose gradient magnitude GM is greater than or equal to T_M are taken as edge pixels, giving the edge map E;
step 3: because the optical axis of the camera is perpendicular to the ground, the plane of the welding line is parallel to the imaging plane of the camera, and is not influenced by perspective effect, and inverse perspective transformation is not needed;
step 4: detecting the converted linear track edges obtained in the step S3 by adopting a sliding window method, screening out the linear track edges meeting the requirements, and finally fitting to obtain an equation of the line where the linear track edges meeting the requirements are located;
step 4-1: projecting the edge map E to the horizontal axis direction of the image to obtain a corresponding statistical histogram;
step 4-2: the statistical histogram is mean-filtered by one-dimensional convolution to remove noise in the data, and only the points whose statistic exceeds T_E% of the maximum of the statistics are retained; in this embodiment T_E% = 25%;
Step 4-3: after filtering processing, the abscissa corresponding to the peak point in each histogram represents the initial base point coordinate of the current edge;
step 4-4: and searching along the longitudinal axis direction of the image by adopting a fixed window size h multiplied by w and a step size s by taking an initial base point as a starting point, wherein h is the height of the window, and w is the width of the window, and the value is equal to the width threshold value of the division of the thick straight line and the thin straight line. In the embodiment of the present invention, the width w=60 (pixels) of the search window, and the height h of the window is equal to the step s, and the calculation method is as follows
Where H represents the height of the region of interest, N represents the preset number of searches, and in the embodiment of the present invention, h=2592 (pixels) is taken, and n=10;
if the number of edge pixels in the window exceeds a set threshold T P The window is considered to be an effective window, and the edge pixels inside the window are considered to be effective edge pixels. Calculating the average value of the abscissa of all edge pixel points in the window, so as to update the base point coordinates of the search window; effective pixel number threshold T P The self-adaptive adjustment can be carried out according to the detection result of the linear track and the searching times:
wherein
wherein ,representing a statistic value corresponding to a coordinate i in the statistic histogram, wherein x represents an abscissa value of an initial base point;
Step 4-5: if the number of edge pixels in the window does not exceed the set threshold T P The window is determined to be an invalid window, the coordinates of the base point are kept unchanged, and searching is continued along the longitudinal axis direction until the preset searching times N are reached;
step 4-6: for each linear track edge, if its number of effective windows is greater than a threshold T_W (in this embodiment T_W = 4), the linear track edge is regarded as a candidate linear track edge; for a single thin linear track, the candidate edge is already the linear track edge that meets the requirements;
step 4-8: and carrying out straight line fitting on the effective edge pixel points contained in each linear track edge meeting the requirements to obtain a linear equation corresponding to the edge.
Step 5: tracking a linear track in the continuous images by adopting a Kalman filter, and adjusting Kalman gain according to the detection confidence of the edge of the linear track;
step 5-1: the number of effective windows described in step 4-6 reflects the reliability of the current candidate edge detection; the greater the number of effective windows, the higher the confidence of the candidate edge detection. For a thin straight-line track, the detection confidence is calculated as follows:
wherein c_1 represents the detection confidence corresponding to the qualifying linear track edge, n_1 represents the number of effective windows of that edge, N represents the preset number of searches, and β represents a correction coefficient; the embodiment of the present invention takes β = 100;
step 5-2: for a thin linear track, the straight line on which the qualifying linear track edge lies forms two intersection points with the upper and lower boundaries of the region of interest, each intersection point carrying its corresponding detection confidence; a correction coefficient matrix C_k is then constructed from these confidences;
Step 5-3: calculating a corrected measurement error covariance matrix R'
R' = diag(C_k) R
where diag(·) represents the diagonalization operation, R represents the initial measurement-error covariance matrix, R = s_R × U, s_R represents a scaling coefficient (the embodiment of the present invention takes s_R = 500), and U represents the identity matrix;
step 5-4: the corrected Kalman gain is
K_k = P_k M^T (M P_k M^T + R')^(-1)
wherein P_k represents the covariance matrix of the prior estimation error of the k-th frame image, initialized to P_0 = U, and M represents the measurement matrix;
Step 5-5: the covariance matrix P_k of the prior estimation error described in step 5-4 is calculated as P_k = A P̂_(k-1) A^T + Q, where Q represents the covariance matrix of the process noise, Q = s_Q × U, s_Q represents a scaling coefficient (the embodiment of the present invention takes s_Q = 0.1), and A represents the state-transition matrix;
P̂_(k-1) denotes the covariance matrix of the posterior estimation error of the (k-1)-th frame; the posterior covariance is updated as P̂_k = (U - K_k M) P_k, wherein U represents the identity matrix;
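Steps 5-2 to 5-5 can be summarised numerically as below. How the entries of C_k are derived from the confidences is not spelled out in the text above, so mapping a confidence c to the factor 1/c (lower confidence gives larger measurement noise) is purely an assumption; s_R = 500 and s_Q = 0.1 follow the embodiment:

```python
import numpy as np

def corrected_gain(P_prior, M, conf, s_R=500.0):
    """R' = diag(C_k) R with R = s_R * U, then K_k = P M^T (M P M^T + R')^-1."""
    c_k = 1.0 / np.clip(np.asarray(conf, dtype=float), 1e-3, None)  # assumed mapping
    R_prime = np.diag(c_k) @ (s_R * np.eye(M.shape[0]))
    return P_prior @ M.T @ np.linalg.inv(M @ P_prior @ M.T + R_prime)

def predict_covariance(P_post_prev, A, s_Q=0.1):
    """Step 5-5 prior covariance: P_k = A P_(k-1) A^T + Q with Q = s_Q * U."""
    return A @ P_post_prev @ A.T + s_Q * np.eye(A.shape[0])
```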
step 5-6: for a thin linear track, the straight line on which the qualifying linear track edge obtained in step 4-8 lies forms 2 intersection points with the upper and lower boundaries of the region of interest; the abscissas of these intersection points form the k-th frame measurement matrix Z_k, which is calculated as follows:
Z_k = M X_k
wherein X_k represents the k-th frame state estimation matrix;
step 5-7: for a thin straight-line track, the state consists of the estimated abscissas of the 2 intersection points formed by the straight line on which the linear track edge lies and the upper and lower boundaries of the region of interest, estimated from the prediction result of the previous image frame, together with the amount by which each of these two abscissas changes between two adjacent frames;
step 5-8: the k-th frame state estimation matrix X_k is calculated as X_k = A X̂_(k-1), wherein the posterior state estimation matrix X̂_(k-1) of frame k-1 is calculated as X̂_(k-1) = X_(k-1) + K_(k-1) (Z_(k-1) - M X_(k-1));
for a thin straight-line track, the initial value is the detection result of the abscissas of the linear track edge endpoints in the first frame image;
step 5-9: the final output result of the linear track tracking is calculated as follows:
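Putting steps 5-5 to 5-9 together for a single thin track, one frame of tracking might look like the sketch below. The 4-element state [x_top, x_bot, dx_top, dx_bot] and the constant-velocity transition matrix are assumptions consistent with step 5-7, and the predict/update equations are the standard Kalman-filter ones, used here in place of the formula images that are not reproduced in this text:

```python
import numpy as np

A = np.array([[1, 0, 1, 0],          # assumed constant-velocity transition matrix
              [0, 1, 0, 1],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], dtype=float)
M = np.array([[1, 0, 0, 0],          # measure the two intersection abscissas
              [0, 1, 0, 0]], dtype=float)

def track_frame(x_post, P_post, z_k, conf, s_Q=0.1, s_R=500.0):
    """One predict/update cycle for a thin track; conf holds the two
    intersection-point confidences used to build the corrected R'."""
    x_prior = A @ x_post                                   # state prediction
    P_prior = A @ P_post @ A.T + s_Q * np.eye(4)           # prior error covariance
    R_prime = np.diag(1.0 / np.clip(np.asarray(conf, dtype=float), 1e-3, None)) * s_R
    K_k = P_prior @ M.T @ np.linalg.inv(M @ P_prior @ M.T + R_prime)
    x_post = x_prior + K_k @ (z_k - M @ x_prior)           # posterior state estimate
    P_post = (np.eye(4) - K_k @ M) @ P_prior               # posterior error covariance
    return x_post, P_post                                  # x_post[:2] = tracked abscissas
```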
finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present application and not for limiting the scope of protection thereof, and although the present application has been described in detail with reference to the above embodiments, it should be understood by those of ordinary skill in the art that: various changes, modifications, or equivalents may be made to the particular embodiments of the application by those skilled in the art after reading the present disclosure, but such changes, modifications, or equivalents are within the scope of the pending claims.

Claims (8)

1. The vision-based linear track detection and tracking method is characterized in that the method is used for detecting and tracking single or a plurality of mutually parallel thin or thick, continuous or discontinuous linear tracks, wherein each linear track is regarded as a thick linear track if the width of the linear track in the horizontal direction of the image is larger than a threshold value w, and is regarded as a thin linear track if the width of the linear track in the horizontal direction of the image is not larger than the threshold value w; the method comprises the following steps:
step S1: the visual image acquisition device shoots a linear track image;
step S2: enhancing the linear track image shot in the step S1, extracting gradient characteristic information, and identifying the edge of each linear track according to gradient amplitude information to obtain an edge map;
step S3: eliminating perspective effect generated in the image shooting process by adopting an inverse perspective transformation method on the edge map obtained in the step S2, and obtaining a transformed edge map;
step S4: detecting the converted linear track edges obtained in the step S3 by adopting a sliding window method, screening out the linear track edges meeting the requirements, and finally fitting to obtain an equation of the line where the linear track edges meeting the requirements are located;
step S5: and tracking each linear track meeting the requirements detected in the continuously shot images by adopting a Kalman filter, and adaptively adjusting the Kalman gain according to the detection confidence of the edge of the linear track.
2. The vision-based linear trajectory detection and tracking method of claim 1, wherein: the step S1 specifically comprises the following steps:
step 1-1: the visual image acquisition device comprises, but is not limited to, a monitoring camera, an industrial camera and a network camera;
step 1-2: the visual image acquisition device is deployed at a suitable position on a vehicle or mobile robot so that the linear track to be detected and tracked lies within the field of view of the visual image acquisition device; the optical axis of the visual image acquisition device is perpendicular to the ground or inclined to the ground at an angle γ, with γ between 0° and 90°, and the visual image acquisition device is used for shooting images of the linear track.
3. The vision-based linear trajectory detection and tracking method of claim 1, wherein: the step S2 specifically comprises the following steps:
step 2-1: taking the linear track images shot by the visual image acquisition device as input one by one, and dividing an interested region where the linear track is positioned in the input image by using a rectangular frame with a fixed size;
step 2-2: carrying out graying treatment on the image in the region of interest;
step 2-3: carrying out image enhancement on the image subjected to the graying treatment;
Step 2-4: extracting the gradient information G_x and G_y in the horizontal and vertical directions of the enhanced image by convolving it with the Sobel operator kernels, wherein I represents the region-of-interest image after graying and image enhancement;
step 2-5: the gradient magnitude GM for each pixel is calculated according to:
GM(x, y) = |G_x(x, y)|
wherein G_x(x, y) represents the horizontal gradient information G_x of the pixel at coordinates (x, y);
Step 2-6: setting a gradient amplitude threshold T_M; pixel points in the region-of-interest image whose gradient amplitude GM is greater than or equal to the threshold T_M are edge pixel points, giving the edge map E:
4. the vision-based linear trajectory detection and tracking method of claim 1, wherein: the step S3 specifically comprises the following steps:
step 3-1: the inverse perspective transformation process of the edge map obtained in the step S2 is as follows:
wherein A_(3×3) represents the inverse perspective transformation matrix, [u v 1]^T represents the homogeneous coordinates of a pixel point in the original image, called the source-point coordinates, and [x y 1]^T represents the transformed pixel-point coordinates, called the target-point coordinates; at least three groups of non-collinear corresponding source-point and target-point coordinates are obtained to solve the inverse perspective transformation matrix A_(3×3);
Step 3-2: during initialization, for a single thin linear track, the intersection points of the linear track edge with the upper and lower boundaries of the region of interest in the image are selected manually, and the two points symmetrical to these two intersection points with respect to the horizontal-direction centre line of the edge map E are calculated, the four points serving as the four source points;
For a single thick straight line track, selecting intersection points of the left and right edges of the straight line track and the upper and lower boundaries of the region of interest in an image in a manual selection mode, and taking the intersection points as four source points;
for a plurality of mutually parallel thin linear tracks, the intersection points of the edges of the first and the last linear track from the left with the upper and lower boundaries of the region of interest in the image are selected manually as the four source points;
for a plurality of mutually parallel thick straight-line tracks, the intersection points of the left-side edge of the first straight-line track from the left and the right-side edge of the last straight-line track with the upper and lower boundaries of the region of interest in the image are selected manually as the four source points;
the four source points are named LB, LU, RU and RB respectively, in anticlockwise order starting from the source point in the lower-left corner, and their coordinates are [u_LB v_LB 1]^T, [u_LU v_LU 1]^T, [u_RU v_RU 1]^T and [u_RB v_RB 1]^T;
Step 3-3: the coordinates of the target points corresponding to the four source points are respectively
Step 3-4: the inverse perspective transformation matrix A_(3×3) of step 3-1 is obtained from the source-point and target-point coordinates obtained in steps 3-2 and 3-3, and the homogeneous coordinates of all pixel points in the edge map obtained in step S2 are multiplied by A_(3×3) to obtain the transformed edge map;
step 3-5: for an image containing a plurality of mutually parallel linear tracks, the source point coordinates of the current frame are automatically selected according to the linear track detection result of the previous frame image, so that the self-adaptive adjustment of the inverse perspective transformation matrix is realized; under the influence of perspective effect, the straight line tracks which originally keep a parallel relationship intersect at vanishing points in the image; calculating to obtain intersection points, namely vanishing point coordinates, by using a linear equation of the edge of the linear track detected in the previous frame of image through a least square method; then for the thin linear track, connecting the vanishing point with the lower end points of the left-hand first linear track edge and the last linear track edge respectively, and for the thick linear track, connecting the vanishing point with the lower end points of the left-hand first linear track edge and the right-hand last linear track edge respectively; the intersection point of the two connecting lines and the upper and lower boundaries of the region of interest is a group of source points of the current frame image.
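As an illustrative sketch of steps 3-1 to 3-4, the matrix A_(3×3) can be solved from the four source points and their targets and applied to the edge map with OpenCV; the library calls are an implementation assumption, not something prescribed by the claim:

```python
import cv2
import numpy as np

def warp_edge_map(edge_map, src_pts, dst_pts):
    """Solve A_{3x3} from four source/target point pairs (LB, LU, RU, RB) and
    apply it to the edge map to remove the perspective effect."""
    src = np.asarray(src_pts, dtype=np.float32)   # [[u_LB, v_LB], [u_LU, v_LU], ...]
    dst = np.asarray(dst_pts, dtype=np.float32)   # corresponding target coordinates
    A = cv2.getPerspectiveTransform(src, dst)     # the inverse perspective matrix
    h, w = edge_map.shape[:2]
    return cv2.warpPerspective(edge_map, A, (w, h)), A
```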
5. The vision-based linear trajectory detection and tracking method of claim 1, wherein: the step S4 specifically comprises the following steps:
step 4-1: projecting the edge map after the inverse perspective transformation to the transverse axis of the image respectively to obtain a corresponding statistical histogram;
step 4-2: average filtering is carried out on the statistical histogram by one-dimensional convolution to eliminate noise in the data, and only the points whose statistical value exceeds T_E% of the maximum of the statistical result are retained;
step 4-3: after filtering processing, the abscissa corresponding to the peak point in each histogram represents the initial base point coordinate of the current edge;
step 4-4: starting from the initial base point, searching along the vertical-axis direction of the image with a fixed window size h × w and fixed step size s, wherein h is the window height and w is the window width, numerically equal to the width threshold dividing thick and thin straight lines; if the number of edge pixel points in the window exceeds a set threshold T_P, the window is considered an effective window, the edge pixel points inside it are regarded as effective edge pixel points, and the mean of the abscissas of all edge pixel points in the window is calculated so as to update the base-point coordinate of the search window;
Step 4-5: if the number of edge pixel points in the window does not exceed the set threshold T_P, the window is determined to be an invalid window, the base-point coordinate remains unchanged, and the search continues along the vertical-axis direction until the preset number of searches N is reached;
step 4-6: for each linear track edge, if its corresponding number of effective windows is greater than a threshold T_W, the linear track edge is considered a candidate edge; for a single thin or thick linear track, the candidate edge is the linear track edge meeting the requirements;
for a plurality of mutually parallel thin or thick linear tracks, after the linear track edges are paired, the distance constraint between candidate edges belonging to the same pair is combined to screen out the candidates meeting the requirements, specifically comprising step 4-7:
step 4-7: for a plurality of mutually parallel thin straight-line tracks, let B_t and B_(t+1), t = 1, 2, ..., T-1, be the sets of initial base-point coordinates of all candidate edges in the t-th pair of edge maps Gr_t = <E_t, E_(t+1)>; the initial base-point coordinates of the candidate edges must then meet the following condition:
wherein D_t is the fixed distance between the t-th pair of linear track edges and T_t is the distance-deviation threshold between the t-th pair of linear track edges; for the initial base-point coordinate sets B'_t and B'_(t+1), t = 1, 2, ..., T-1, that satisfy the above condition, the effective initial base-point coordinates are
The corresponding candidate edges are linear track edges meeting the requirements;
for a plurality of mutually parallel thick straight-line tracks, let the sets of initial base-point coordinates of all candidate edges in the t-th pair of left-edge maps Gr_t^L and the t-th pair of right-edge maps Gr_t^R be given respectively; the initial base-point coordinates of the candidate edges must meet the following condition:
wherein D_t^L and D_t^R are respectively the fixed distances between the left-side edges and between the right-side edges of the t-th pair of linear tracks, and T_t^L and T_t^R are respectively the distance-deviation thresholds between the left-side edges and between the right-side edges of the t-th pair of linear tracks; for the initial base-point coordinate sets satisfying the above condition, the effective initial base-point coordinates are
The corresponding candidate edges are linear track edges meeting the requirements;
step 4-8: and carrying out straight line fitting on the effective edge pixel points contained in each linear track edge meeting the requirements to obtain a linear equation corresponding to the edge.
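The distance constraint of step 4-7 for thin tracks can be read, for illustration, as |(b' - b) - D_t| <= T_t for base points b and b' of adjacent edge maps; the exact inequality is not legible in this text, so this form is an assumption:

```python
def pair_base_points(b_t, b_t1, d_t, tol_t):
    """Keep candidate base-point pairs from adjacent edge maps whose horizontal
    spacing matches the fixed track spacing D_t within the threshold T_t."""
    return [(b, b1) for b in b_t for b1 in b_t1 if abs((b1 - b) - d_t) <= tol_t]
```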
6. The vision-based linear trajectory detection and tracking method of claim 5, wherein: the specific method for pairing the linear track edges is as follows:
for a plurality of mutually parallel thin linear tracks, the linear tracks are numbered 1 to T sequentially from left to right along the horizontal direction of the image, wherein T represents the number of linear tracks, and the edge map E is stored as T edge maps E_1, E_2, ..., E_T according to the linear track to which each edge belongs;
two adjacent linear track edges are grouped into a pair Gr_t according to their relative position relationship, divided as follows:
Gr_t = <E_t, E_(t+1)>, t = 1, 2, ..., T-1
for a thick straight line track in an input image, according to gradient direction information, dividing the edge of the straight line track into a left side, a right side and other three types, wherein the specific steps are as follows:
the gradient direction feature GD for each pixel is calculated according to the following equation:
wherein G_x(x, y) and G_y(x, y) respectively represent the gradient information G_x and G_y of the pixel at coordinates (x, y);
all pixel points in the edge map E are divided into three classes, left, right and other, according to the gradient-direction feature GD, as follows:
edge pixel points belonging to the other class are ignored, i.e. E(x, y) = 0; pixel points belonging to the left class are kept as the left edge map E_L, i.e. E_L(x, y) = 255; and pixel points belonging to the right class are kept as the right edge map E_R, i.e. E_R(x, y) = 255;
For a plurality of mutually parallel thick linear tracks, the linear tracks are numbered 1 to T sequentially from left to right along the horizontal direction of the image, wherein T represents the number of linear tracks, and the edge map E is stored as 2T edge maps according to the linear track to which each edge belongs and its edge category (left or right);
edges of the same category belonging to two adjacent linear tracks are grouped into a pair Gr_t^L or Gr_t^R according to their relative position relationship, divided as follows:
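A sketch of the left/right split of claim 6; since the GD formula is not reproduced above, classifying edge pixels by the sign of the horizontal gradient G_x (dark-to-bright transitions taken as left edges) is an assumed stand-in:

```python
import cv2
import numpy as np

def split_left_right_edges(edge_map, roi_gray):
    """Split edge pixels of E into the left edge map E_L and right edge map E_R;
    pixels that fit neither class (G_x == 0 here) fall into the 'other' class."""
    g_x = cv2.Sobel(roi_gray, cv2.CV_32F, 1, 0, ksize=3)
    e_l = np.where((edge_map > 0) & (g_x > 0), 255, 0).astype(np.uint8)
    e_r = np.where((edge_map > 0) & (g_x < 0), 255, 0).astype(np.uint8)
    return e_l, e_r
```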
7. The vision-based linear trajectory detection and tracking method of claim 5, wherein: the effective-pixel-number threshold T_P in steps 4-4 and 4-5 is adaptively adjusted according to the linear track detection result and the number of searches:
wherein N represents the preset number of searches and P_x represents the sum of the statistical values of the effective edge pixel points within the search window, the statistical value corresponding to coordinate i in the statistical histogram and the abscissa value x of the initial base point being used in computing P_x.
8. The vision-based linear trajectory detection and tracking method of claim 6, wherein: the step S5 specifically comprises the following steps:
step 5-1: the number of the effective windows in the step 4-6 reflects the reliability degree of the current candidate edge detection, and the more the number of the effective windows is, the higher the confidence degree of the candidate edge detection is;
for a thin straight line track, the detection confidence calculating method comprises the following steps:
wherein c_t represents the detection confidence corresponding to the t-th qualifying linear track edge, n_t represents the number of effective windows of that edge, N represents the preset number of searches, β represents a correction coefficient, and T represents the total number of qualifying linear tracks;
For a thick straight line track, the detection confidence calculating method comprises the following steps:
wherein c_t^L and c_t^R respectively represent the detection confidences corresponding to the left and right edges of the t-th qualifying linear track, n_t^L and n_t^R respectively represent the numbers of effective windows of the left and right edges of the t-th qualifying linear track, N represents the preset number of searches, β represents a correction coefficient, and T represents the total number of qualifying linear tracks;
step 5-2: for thin straight-line tracks, the straight line on which each qualifying edge lies forms two intersection points with the upper and lower boundaries of the region of interest, each with its corresponding confidence; a correction coefficient matrix C_k is then constructed from these confidences;
for thick straight-line tracks, the straight lines on which the left and right edges of each qualifying linear track lie form four intersection points with the upper and lower boundaries of the region of interest, each with its corresponding confidence; a correction coefficient matrix C_k is constructed from these confidences;
Step 5-3: calculating a corrected measurement error covariance matrix R'
R' = diag(C_k) R
wherein diag(·) represents the diagonalization operation, R represents the initial measurement-error covariance matrix, R = s_R × U, s_R represents a scaling coefficient, and U represents the identity matrix;
step 5-4: the corrected Kalman gain K_k is
K_k = P_k M^T (M P_k M^T + R')^(-1)
wherein P_k represents the covariance matrix of the prior estimation error of the k-th frame image, initialized to P_0 = U, and M represents the measurement matrix, which takes different forms for a thin straight-line track and for a thick straight-line track; the element M_ij in the i-th row and j-th column of M is defined as follows;
Step 5-5: the covariance matrix P_k of the prior estimation error described in step 5-4 is calculated as P_k = A P̂_(k-1) A^T + Q, where Q represents the covariance matrix of the process noise, Q = s_Q × U, s_Q represents the scaling coefficient, the superscript T denotes the matrix transpose, and A represents the state-transition matrix,
for a thin straight line trajectory:
for a thick straight line trajectory:
P̂_(k-1) represents the covariance matrix of the posterior estimation error of frame k-1, and the posterior covariance is updated as P̂_k = (U - K_k M) P_k, wherein U represents the identity matrix;
step 5-6: for a thin linear track, the straight line on which the t-th qualifying linear track edge lies forms 2 intersection points with the upper and lower boundaries of the region of interest, and the abscissas of these intersection points are used to construct the k-th frame measurement matrix Z_k;
for a thick linear track, the straight lines on which the left and right edges of the t-th qualifying linear track obtained in step 4-8 lie form 4 intersection points with the upper and lower boundaries of the region of interest, and the abscissas of these intersection points are used to construct the k-th frame measurement matrix Z_k;
the k-th frame measurement matrix Z_k is calculated by the following formula:
Z_k = M X_k
wherein X_k represents the k-th frame state estimation matrix;
Step 5-7: for a thin straight-line track, the state consists of the estimated abscissas of the 2 intersection points formed by the straight line on which the t-th linear track edge lies and the upper and lower boundaries of the region of interest, estimated from the prediction result of the previous image frame, together with the amount by which each of these abscissas changes between two adjacent frames;
for a thick straight-line track, the width of the straight-line track is used as an additional constraint for tracking the straight-line track: the state consists of the estimated abscissas of the 4 intersection points formed by the straight lines on which the left and right edges of the t-th straight-line track lie and the upper and lower boundaries of the region of interest, estimated from the prediction result of the previous image frame, the amount by which each of these four abscissas changes between two adjacent frames, the width of the top of the straight-line track and its change between two adjacent frames, and the width of the bottom of the straight-line track and its change between two adjacent frames;
step 5-8: the k-th frame state estimation matrix X_k is calculated as X_k = A X̂_(k-1), wherein the posterior state estimation matrix X̂_(k-1) of frame k-1 is calculated as X̂_(k-1) = X_(k-1) + K_(k-1) (Z_(k-1) - M X_(k-1));
for a thin straight-line track, the initial value is the detection result of the abscissas of the linear track edge endpoints in the first frame image; for a thick straight-line track, the initial value comprises the detected abscissas of the linear track edge endpoints and the linear track widths in the first frame image;
step 5-9: the final output result of the linear track tracking is calculated as follows:
CN202310669655.7A 2023-06-07 2023-06-07 Linear track detection and tracking method based on vision Active CN116681721B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310669655.7A CN116681721B (en) 2023-06-07 2023-06-07 Linear track detection and tracking method based on vision


Publications (2)

Publication Number Publication Date
CN116681721A true CN116681721A (en) 2023-09-01
CN116681721B CN116681721B (en) 2023-12-29

Family

ID=87783265

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310669655.7A Active CN116681721B (en) 2023-06-07 2023-06-07 Linear track detection and tracking method based on vision

Country Status (1)

Country Link
CN (1) CN116681721B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117274205A (en) * 2023-09-27 2023-12-22 深圳市六六六国际旅行社有限公司 Quadrilateral detection method based on novel data annotation and data enhancement

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103839264A (en) * 2014-02-25 2014-06-04 中国科学院自动化研究所 Detection method of lane line
CN104318258A (en) * 2014-09-29 2015-01-28 南京邮电大学 Time domain fuzzy and kalman filter-based lane detection method
CN104408460A (en) * 2014-09-17 2015-03-11 电子科技大学 A lane line detecting and tracking and detecting method
US20150324998A1 (en) * 2014-05-06 2015-11-12 Nant Holdings Ip, Llc Image-based feature detection using edge vectors
WO2017096759A1 (en) * 2015-12-07 2017-06-15 腾讯科技(深圳)有限公司 Method, device and system for detecting information card in image
CN107180228A (en) * 2017-05-02 2017-09-19 开易(北京)科技有限公司 A kind of grad enhancement conversion method and system for lane detection
CN107679520A (en) * 2017-10-30 2018-02-09 湖南大学 A kind of lane line visible detection method suitable for complex condition
CN111626180A (en) * 2020-05-22 2020-09-04 清华大学 Lane line detection method and device based on polarization imaging
CN111738071A (en) * 2020-05-15 2020-10-02 南京航空航天大学 Inverse perspective transformation method based on movement change of monocular camera
CN112270690A (en) * 2020-10-12 2021-01-26 淮阴工学院 Self-adaptive night lane line detection method based on improved CLAHE and sliding window search
CN112884816A (en) * 2021-03-23 2021-06-01 武汉理工大学 Vehicle feature deep learning recognition track tracking method based on image system
CN113269098A (en) * 2021-05-27 2021-08-17 中国人民解放军军事科学院国防科技创新研究院 Multi-target tracking positioning and motion state estimation method based on unmanned aerial vehicle
CN114219842A (en) * 2021-12-14 2022-03-22 东南大学 Visual identification, distance measurement and positioning method in port container automatic loading and unloading operation
CN116092035A (en) * 2023-01-13 2023-05-09 一汽解放汽车有限公司 Lane line detection method, lane line detection device, computer equipment and storage medium


Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
FANG ZHENG等: "Improved Lane Line Detection Algorithm Based on Hough Transform", APPLIED PROBLEMS, vol. 28, pages 254, XP036527707, DOI: 10.1134/S1054661818020049 *
SOVIRA TAN等: "Inverse perspective mapping and optic flow: A calibration method and a quantitative analysis", IMAGE AND VISION COMPUTING, vol. 24, no. 2, pages 153 - 165 *
XIE YUNING: "Research on Key Technologies of Automatic 3D Reconstruction of Buildings Integrating Morphology and Feature Point Detection", China Master's Theses Full-text Database, Engineering Science and Technology II, pages 038-400 *
ZHAO FENG: "Research on Lane Line Detection Methods Based on Structured Roads", China Master's Theses Full-text Database, Engineering Science and Technology II, pages 035-52 *


Also Published As

Publication number Publication date
CN116681721B (en) 2023-12-29

Similar Documents

Publication Publication Date Title
CN104915963B (en) A kind of detection and localization method for PLCC elements
CN109285179A (en) A kind of motion target tracking method based on multi-feature fusion
CN111210477B (en) Method and system for positioning moving object
CN112819772B (en) High-precision rapid pattern detection and recognition method
CN110569704A (en) Multi-strategy self-adaptive lane line detection method based on stereoscopic vision
CN107330376A (en) A kind of Lane detection method and system
CN116681721B (en) Linear track detection and tracking method based on vision
CN105894521A (en) Sub-pixel edge detection method based on Gaussian fitting
CN115592324B (en) Automatic welding robot control system based on artificial intelligence
CN111678518B (en) Visual positioning method for correcting automatic parking path
CN112396656A (en) Outdoor mobile robot pose estimation method based on fusion of vision and laser radar
CN112085700B (en) Automatic extraction method, system and medium for weld joint region in X-ray image
CN113313047B (en) Lane line detection method and system based on lane structure prior
CN109961013A (en) Recognition methods, device, equipment and the computer readable storage medium of lane line
CN108447016A (en) The matching process of optical imagery and SAR image based on straight-line intersection
CN111563896A (en) Image processing method for catenary anomaly detection
CN111429485B (en) Cross-modal filtering tracking method based on self-adaptive regularization and high-reliability updating
Tu et al. An efficient crop row detection method for agriculture robots
Magistri et al. Towards in-field phenotyping exploiting differentiable rendering with self-consistency loss
CN114241438B (en) Traffic signal lamp rapid and accurate identification method based on priori information
CN116385495A (en) Moving target closed-loop detection method of infrared video under dynamic background
CN108388854A (en) A kind of localization method based on improvement FAST-SURF algorithms
CN111667429B (en) Target positioning correction method for inspection robot
CN105118069A (en) Complex environment straight line detection and screening method and robot applying same
CN113298725A (en) Correction method for superposition error of ship icon image

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant