CN113822912B - Scale self-adaptive tracking method and device for image target length-width ratio change - Google Patents


Info

Publication number
CN113822912B
CN113822912B (application CN202111171804.4A)
Authority
CN
China
Prior art keywords
target
frame
filter
target image
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111171804.4A
Other languages
Chinese (zh)
Other versions
CN113822912A (en)
Inventor
苏昂
陆伟康
尚洋
张文龙
李璋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National University of Defense Technology filed Critical National University of Defense Technology
Priority to CN202111171804.4A priority Critical patent/CN113822912B/en
Publication of CN113822912A publication Critical patent/CN113822912A/en
Application granted granted Critical
Publication of CN113822912B publication Critical patent/CN113822912B/en


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/246Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/10Image enhancement or restoration using non-spatial domain filtering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/20Image enhancement or restoration using local operators
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20004Adaptive image processing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20024Filtering details
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20048Transform domain processing
    • G06T2207/20056Discrete and fast Fourier transform, [DFT, FFT]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The application relates to a scale-adaptive tracking method and device for changes in the aspect ratio of an image target. The method comprises the following steps: on the basis of the position filter, two further filters that independently estimate the length and width scales are added, so that adaptive tracking is achieved when the appearance of the target changes significantly. The method extends the applicable range of target tracking methods: it can be used for tracking ground moving targets, and has important theoretical research significance and broad application prospects for tracking fast-moving aircraft targets.

Description

Scale self-adaptive tracking method and device for image target length-width ratio change
Technical Field
The application relates to the technical field of target tracking, in particular to a scale self-adaptive tracking method and device for image target length-width ratio change.
Background
Among conventional target tracking methods, those based on correlation filtering have attracted extensive attention from researchers at home and abroad by virtue of their high efficiency and accuracy. Correlation filtering greatly reduces the amount of computation by using the Fourier transform to convert image matrix operations in the spatial domain into element-wise products in the frequency domain.
Henriques et al. introduced kernel functions and HOG features into correlation filtering and proposed the kernel correlation filter, which greatly improves accuracy but cannot handle target scale variation. To address scale change, Martin et al. proposed the DSST algorithm, which calculates position estimation and scale estimation separately: a scale-estimation filter is trained in addition to the position-estimation filter, the target position is updated first, and the target scale is then estimated, giving the tracker scale-adaptive capability. However, DSST cannot change the aspect ratio of the target frame and is therefore only applicable when the distance between the target and the observation camera changes. When the target's pose changes, its aspect ratio changes as well, and the initially given target frame clearly cannot reflect the target's real-time aspect ratio: either part of the target falls outside the target frame, so only local feature information of the target is obtained, or the target frame contains a large amount of background, so excessive background information is learned, causing tracking drift or even failure.
Disclosure of Invention
In view of the above, it is necessary to provide a scale-adaptive tracking method and apparatus for image target aspect ratio change that can improve the stability and overall performance of the tracker.
A method for scale-adaptive tracking of changes in aspect ratio of an image target, the method comprising:
acquiring a target data set to be tracked, wherein the target data set comprises a plurality of frames of target images which are arranged in a time sequence;
when a plurality of frame target images behind a second frame target image are subjected to target tracking, length space characteristics and width space characteristics are extracted from a current frame image according to an estimated target frame obtained from a previous frame target image, a current frame candidate area is determined according to estimated center position coordinates of a target in the previous frame target image, and position characteristics are extracted from the current frame target image according to the current frame candidate area;
performing correlation operation on the position characteristics and an updated position filter obtained from a previous frame of target image to obtain an estimated central position coordinate of a target in a current frame of image;
performing correlation operation on the length space characteristic and the width space characteristic and an updated length filter and an updated width filter obtained from a previous frame of target image respectively to obtain an estimated target frame of a target in a current frame of target image;
determining a new candidate area according to the estimated central position coordinates of the target in the current frame target image, constructing and training according to the position features extracted from the current frame target image by the new candidate area to obtain a new position filter, and constructing and training according to the length space features and the width space features extracted from the current frame target image by the estimated target frame of the target in the current frame target image to obtain a new length filter and a new width filter;
and iteratively updating the updated position filter, updated length filter, and updated width filter obtained from the previous frame of target image according to the new position filter, new length filter, and new width filter obtained from the current frame of target image, to correspondingly obtain the updated position filter, updated length filter, and updated width filter applied to the next frame of image.
In one embodiment, before target tracking is performed on the frames of target images subsequent to the second frame of target image, the method further includes processing a first frame of target image in the target data set:
determining an initial target frame in the first frame target image, and determining a first frame candidate region according to the position of the central point of the initial target frame;
constructing and training an initial position filter according to the position characteristics of the first frame candidate region extracted from the first frame target image;
and respectively constructing and training according to the length space features and the width space features of the initial target frame extracted from the first frame target image to obtain an initial length filter and an initial width filter.
In one embodiment, processing the second frame target image after processing the first frame target image includes:
extracting position characteristics in a second frame target image according to the first frame candidate region, and performing correlation operation on the position characteristics and an initial position filter to obtain an estimated central position coordinate of a target in the second frame target image;
extracting length space features and width space features from a second frame target image according to the initial target frame, and performing correlation operation on the length space features and the width space features with an initial length filter and an initial width filter respectively to obtain an estimated target frame of a target in the second frame target image;
determining a new candidate region according to the estimated central position coordinates of the target in the second frame of target image, obtaining a new position filter according to the new candidate region, and obtaining a new length filter and a new width filter according to the estimated target frame of the target in the second frame of target image;
and iteratively updating the initial position filter, initial length filter, and initial width filter obtained from the first frame of target image according to the new position filter, new length filter, and new width filter obtained from the second frame of target image, to correspondingly obtain an updated position filter, updated length filter, and updated width filter applied to the next frame of image.
In one embodiment, when a first frame candidate area is determined according to an initial target frame in a first frame target image, if the size of the target frame is larger than ten thousand pixels, the image area where the first frame candidate area is located is downsampled so that the size of the target frame becomes smaller than ten thousand pixels.
In one embodiment, when each frame of target image is processed, the length space feature and the width space feature extracted according to the target frame and the position feature extracted according to the candidate region are fusion features of both the histogram of oriented gradients feature and the image gray feature.
In one embodiment, when each frame of target image is processed, constructing and training a position filter according to the position features includes:
and constructing a two-dimensional Gaussian distribution response map according to the position characteristics, and then obtaining a position filter according to the position characteristics and the two-dimensional Gaussian distribution response map.
In one embodiment, when each frame of target image is processed, constructing and training a position filter according to the position features further includes:
generating a plurality of training samples in an image area where a current frame target image is located in a candidate area according to a cyclic shift method, and extracting the position characteristics of each training sample;
and constructing and training a position filter according to the plurality of position features.
The application also provides a scale adaptive tracking device for the change of the aspect ratio of an image target, which comprises:
the target data set acquisition module is used for acquiring a target data set to be tracked, wherein the target data set comprises a plurality of frames of target images which are arranged in a time sequence;
the characteristic extraction module is used for extracting length space characteristics and width space characteristics in a current frame image according to an estimated target frame obtained from a previous frame target image when a target tracking is carried out on a plurality of frame target images behind a second frame target image, determining a current frame candidate area according to the estimated center position coordinates of a target in the previous frame target image, and extracting position characteristics in the current frame target image according to the current frame candidate area;
the position estimation module is used for carrying out correlation operation on the position characteristics and an updated position filter obtained from a previous frame of target image to obtain an estimated central position coordinate of a target in a current frame image;
the target frame estimation module is used for respectively carrying out correlation operation on the length space characteristic and the width space characteristic with an updated length filter and an updated width filter obtained from a previous frame of target image to obtain an estimated target frame of a target in a current frame of target image;
the current frame optimal filter obtaining module is used for determining a new candidate area according to the estimated central position coordinates of the target in the current frame target image, constructing and training according to the position characteristics of the new candidate area extracted from the current frame target image to obtain a new position filter, and constructing and training according to the length space characteristics and the width space characteristics of the estimated target frame of the target in the current frame target image extracted from the current frame target image to obtain a new length filter and a new width filter;
and the filter updating module is used for iteratively updating the updated position filter, updated length filter, and updated width filter obtained from the previous frame of target image according to the new position filter, new length filter, and new width filter obtained from the current frame of target image, so as to correspondingly obtain the updated position filter, updated length filter, and updated width filter applied to the next frame of image.
A computer device comprising a memory storing a computer program and a processor implementing the above-described method for scale-adaptive tracking of changes in aspect ratio of an image target when the computer program is executed.
A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, implements the above-mentioned method for scale-adaptive tracking of changes in aspect ratios of image objects.
According to the scale self-adaptive tracking method and device for the change of the aspect ratio of the image target, the independent length filter and width filter are provided for estimating the change of the length and width of the target respectively aiming at the problem of the change of the scale of the target in the length and width directions besides the position filter, so that the scale self-adaptive performance is effectively improved, and meanwhile, the accuracy of target tracking is improved.
Drawings
FIG. 1 is a schematic flow chart of a scale-adaptive tracking method for changes in aspect ratio of an image target according to an embodiment;
FIG. 2 is a schematic diagram of a correlation filtering principle in one embodiment;
FIG. 3 is a schematic diagram of the cyclic shift principle in one embodiment;
FIG. 4 is a flowchart illustrating an implementation procedure of a scale-adaptive tracking method for aspect ratio changes of an image target according to another embodiment;
FIG. 5 is a block diagram of an embodiment of a scale-adaptive tracking apparatus for aspect ratio changes of an image target;
FIG. 6 is a diagram illustrating an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
When a target is tracked, its rapid motion continuously changes the background, the relative distance, and the relative azimuth between the tracked target and the observer, so the appearance of the target in the image changes accordingly.
Aiming at the problem that the existing method cannot estimate the length-width ratio scale of a target, as shown in fig. 1, the application provides a scale self-adaptive tracking method for the change of the length-width ratio of the target of an image, which comprises the following steps:
step S100, acquiring a target data set to be tracked, wherein the target data set comprises a plurality of frames of target images which are arranged in a time sequence;
step S110, when a plurality of frame target images behind a second frame target image are subjected to target tracking, length space characteristics and width space characteristics are extracted from a current frame image according to an estimated target frame obtained from a previous frame target image, a current frame candidate area is determined according to estimated center position coordinates of a target in the previous frame target image, and position characteristics are extracted from the current frame target image according to the current frame candidate area;
step S120, the position characteristics and an updated position filter obtained from the previous frame of target image are subjected to correlation operation to obtain the estimated central position coordinate of the target in the current frame of image;
step S130, the length space characteristic and the width space characteristic are respectively correlated with an updated length filter and an updated width filter obtained from the previous frame of target image to correspondingly obtain an estimated target frame of the target in the current frame of target image;
step S140, determining a new candidate area according to the estimated central position coordinates of the target in the current frame target image, constructing and training a new position filter according to the position characteristics extracted by the new candidate area in the current frame target image, and constructing and training a new length filter and a new width filter according to the length space characteristics and the width space characteristics extracted by the estimated target frame of the target in the current frame target image;
step S150, iteratively updating the updated position filter, updated length filter, and updated width filter obtained from the previous frame of target image according to the new position filter, new length filter, and new width filter obtained from the current frame of target image, so as to obtain the updated position filter, updated length filter, and updated width filter applied to the next frame of target image.
In this embodiment, when tracking a target in each target image of a frame subsequent to the second frame, a target frame and a candidate region in the target image of the current frame are constructed by using a target center position and a target frame detected in the target image of the previous frame, and an estimated center position and an estimated target frame of the target in the target image of the current frame are obtained by using an update filter acquired from the target image of the previous frame, so as to track the target in the target image of the current frame.
In fact, this can also be understood as follows: the center position coordinate obtained by the position filter in the current frame target image is taken as the position of the target; the length filter and the width filter are then used to estimate the target's scale changes in the X and Y directions respectively, determining the target's scale in each direction; and the scale changes in the length and width directions are passed on to the position filter of the next frame, whose search area is adjusted in time, thereby realizing scale-adaptive tracking of the target's aspect ratio change. This rests on the assumption that the scale change of a target between two adjacent frames is usually smaller than its position change; determining the target's position first and then estimating its length and width therefore yields more accurate position and scale information.
In addition, in each frame the target is tracked and an optimal position filter, length filter, and width filter are constructed according to the estimated target center position and target frame scale. The filters obtained in the previous frame are iteratively updated with the filters obtained in the current frame to produce the updated filters applied to the next frame, which improves the accuracy of target tracking in each frame of the image.
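The text does not spell out the iterative update rule. A common choice in DSST-style trackers, sketched here as an assumption rather than the patent's exact formula, is linear interpolation between the previous filter and the one trained on the current frame, with a learning rate η:

```python
import numpy as np

def update_filter(H_prev, H_new, eta=0.025):
    # running average of filter coefficients (e.g. in the frequency domain);
    # eta (learning rate) is an assumed hyperparameter, not given in the text
    return (1.0 - eta) * H_prev + eta * H_new

H = np.full(4, 2.0)                       # filter from the previous frame
H = update_filter(H, np.zeros(4), eta=0.5)
print(H)  # [1. 1. 1. 1.]
```

A small η makes the tracker adapt slowly and resist occlusion; a large η adapts quickly but risks drift.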
In step S100, the target data set to be tracked can be obtained in various ways, for example as target images acquired by a monocular laser speckle projection system, by a visible light camera, or by an infrared camera; that is, the target tracking method of this application is applicable to a variety of fields.
In this embodiment, before target tracking is performed on the frames of target images subsequent to the second frame, the first frame of target image in the target data set is processed as follows: an initial target frame is determined in the first frame target image; a first frame candidate region is determined according to the center point of the initial target frame; an initial position filter is constructed and trained from the position features of the first frame candidate region extracted from the first frame target image; and an initial length filter and an initial width filter are respectively constructed and trained from the length space features and width space features of the initial target frame extracted from the first frame target image.
The first frame target image is used for constructing an initial filter, specifically:
An initial target frame is given in the first frame of target image by a detection algorithm or by manual annotation, i.e., the initial target frame is a ground truth. The size of the target frame is then checked: if it is larger than ten thousand pixels, the image area where the candidate area is located is downsampled by a factor of s (i.e., reduced to 1/s of the original image) so that the initial training image block is smaller than ten thousand pixels. The size of the target frame is then (M_s, N_s).
Centered at the initial target frame center, a candidate region Z = (1 + padding)·(M_s·Scale_x, N_s·Scale_y) is taken, where Z is the candidate region of the first frame target image and 1 + padding represents the search range for the target; padding is a parameter that adjusts the size of the search region. M_s is the size of the downsampled initial target frame in the x direction, N_s its size in the y direction; Scale_x and Scale_y are the scale changes in the x and y directions, both initialized to 1.
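The candidate-region sizing above can be sketched in code. The value padding = 1.5 is an assumed default; the text leaves padding as a tunable parameter:

```python
def candidate_region(M_s, N_s, scale_x=1.0, scale_y=1.0, padding=1.5):
    # Z = (1 + padding) * (M_s * Scale_x, N_s * Scale_y);
    # padding = 1.5 is an assumed value for illustration
    w = (1 + padding) * M_s * scale_x
    h = (1 + padding) * N_s * scale_y
    return int(round(w)), int(round(h))

print(candidate_region(40, 20))  # (100, 50) with padding = 1.5
```

When Scale_x or Scale_y grows in later frames, the search window grows with it, which is how the scale change is passed to the next frame's position filter.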
When each frame of target image is processed, the length space characteristic and the width space characteristic extracted according to the target frame and the position characteristic extracted according to the candidate area are fusion characteristics of the histogram of oriented gradient and the image gray level characteristic.
A position feature is extracted in the candidate region of the first frame target image; the extracted feature is a fusion of the HOG (Histogram of Oriented Gradients) feature and the image gray feature. In this method, HOG is a 31-dimensional feature, which together with the one-dimensional gray feature gives 32 feature dimensions in total, so the position filter also has 32 layers and size [(1 + padding)·(M, N)] × 32. During the subsequent tracking process, because the target scale changes, the candidate region is scaled so that it matches the size of the filter, allowing a better description of the target.
When length space features and width space features are extracted according to the initial target frame, the image area where the target frame is located is downsampled so that the initial target frame is smaller than 512 pixels, giving a target size of (M_512, N_512). A scale factor a = 1.02 is selected, and a 1 × S Gaussian-distributed correlation response map g is constructed over a scale-space series of S = 33 levels. For x-direction scale-space feature extraction, the sampled region is Z_512 = (M_512·Scale_x·a^n, N_512·Scale_y), where n takes the S values in the interval [−(S−1)/2, (S−1)/2]; this yields S candidate regions that gradually vary in the x direction, from which HOG and gray features are extracted. At each scale the target's features are still two-dimensional; to train a 1 × S one-dimensional filter (with M_512·N_512 layers), they must be pulled into one-dimensional vector form. Converting the features at each scale into vectors gives an S × (M_512·N_512) feature matrix, from which the length filter is trained. Width space feature extraction in the y direction follows the same principle.
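The x-direction scale sampling described above can be sketched as follows; only the sample-region sizes are computed here, with a = 1.02 and S = 33 taken from the text and the other inputs chosen as example values:

```python
import numpy as np

def length_scale_sizes(M512, N512, scale_x=1.0, scale_y=1.0, a=1.02, S=33):
    # widths of the S sample regions: M_512 * Scale_x * a**n for
    # n in [-(S-1)/2, (S-1)/2]; the height stays fixed at N_512 * Scale_y
    ns = np.arange(S) - (S - 1) / 2.0
    widths = M512 * scale_x * a ** ns
    height = N512 * scale_y
    return widths, height

widths, height = length_scale_sizes(50, 30)
print(len(widths), widths[16])  # 33 samples; the middle one is the current width
```

Extracting features from each of these regions and flattening them gives the S × (M_512·N_512) matrix used to train the length filter; swapping the roles of width and height gives the width filter's samples.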
In the present application, the principle of the correlation filtering is utilized to construct and train the position filter, the length filter and the width filter respectively. It should be noted that, in each frame, there is a step of constructing a filter, and the method is the same for constructing a new filter, that is, step S140, in a plurality of frame target images after the second frame target image, and the construction of the initial filter is taken as an example in this application.
The method of deriving the initial position filter from the image features is conceptually based on the correlation filtering. The basic principle of the correlation filtering is as follows: after the feature f of the initial image block calculated by the first frame of the initial target frame is obtained, the filter h is trained, so that the output graph of the correlation response is a gaussian distribution graph g, and the peak value of the output graph is the central point of the image, as shown in fig. 2. Namely:
g = f ⊛ h  (1)
a Fast Fourier Transform (FFT) is performed on equation (1) to transform into the following form:
G = F ⊙ H*  (2)
Thus, the filter template H is obtained (with element-wise division):

H* = G / F  (3)
and in the subsequent frame, the correlation response peak value of the search area and the filter is the target center position.
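The single-sample construction of equations (1)–(3) can be checked numerically: train H from one feature patch and its Gaussian target, then verify that correlating the patch with H reproduces the centred peak. The small ε added to the denominator is an assumed numerical safeguard, not part of the formula:

```python
import numpy as np

rng = np.random.default_rng(0)
f = rng.standard_normal((32, 32))        # features of the initial image block

ys, xs = np.mgrid[:32, :32]              # Gaussian response target g, eq. (1)
g = np.exp(-((ys - 16) ** 2 + (xs - 16) ** 2) / (2 * 2.0 ** 2))

F, G = np.fft.fft2(f), np.fft.fft2(g)
H = G / (F + 1e-8)                       # filter template, eq. (3)

# correlation response of the training patch itself via eq. (2)
resp = np.real(np.fft.ifft2(F * H))
print(np.unravel_index(resp.argmax(), resp.shape))  # peak at the centre: (16, 16)
```

In a real tracker H is additionally regularized and averaged over samples; this sketch only demonstrates the frequency-domain identity.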
In this embodiment, when processing each frame of target image, constructing and training a position filter according to the position features includes: and constructing a two-dimensional Gaussian distribution response map according to the position characteristics, and then obtaining a position filter according to the position characteristics and the two-dimensional Gaussian distribution response map.
When each frame of target image is processed, the method for constructing and training the position filter according to the position characteristics further comprises the following steps: and generating a plurality of training samples in the image area of the current frame target image of the candidate area according to a cyclic shift method, extracting the position characteristics of each training sample, and constructing and training according to the position characteristics to obtain a position filter.
Specifically, the kernel correlation filter used by the position filter introduces kernel functions and multi-channel features on the basis of the original correlation filter, and generates a number of candidate-region training samples (x_i, y_i) by the cyclic shift method, where x_i is a sample feature column vector and y_i is a label scalar. The cyclic shifts of the training samples are generated by a permutation matrix, as shown in fig. 3.
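The cyclic-shift sample generation can be illustrated on a toy 1-D sample: every row of the resulting matrix is one virtual training sample, and the matrix is circulant, which is what makes the Fourier-domain diagonalization of equation (6) possible:

```python
import numpy as np

base = np.arange(5.0)                 # a 1-D sample, for illustration only

# all cyclic shifts of the base sample; each row is one training sample
X = np.stack([np.roll(base, i) for i in range(len(base))])
print(X[1])  # [4. 0. 1. 2. 3.]

# X is circulant, so its eigenvalue magnitudes match |DFT(base)|
print(np.allclose(np.sort(np.abs(np.linalg.eigvals(X))),
                  np.sort(np.abs(np.fft.fft(base)))))  # True
```

For 2-D image patches the same idea applies with shifts along both axes, giving the dense set of translated samples without ever materializing them.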
In order to prevent overfitting, a regularization penalty term with parameter λ is added, and the loss function is established by the ridge regression method:

loss = min_ω Σ_i (f(x_i) − y_i)^2 + λ‖ω‖^2 (4)

In equation (4), f(x_i) = ω^T x_i. Setting the derivative with respect to ω to 0 yields the closed-form solution:

ω = (X^T X + λI)^{-1} X^T Y (5)
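Equation (5) can be checked numerically; the data sizes, the true weight vector, and the tiny λ below are arbitrary choices for the check, not values from the patent.

```python
import numpy as np

# Hedged numerical check of the closed-form ridge solution (equation (5)).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))            # 50 samples, 4 features
w_true = np.array([1.0, -2.0, 0.5, 3.0])
y = X @ w_true                          # noise-free labels
lam = 1e-8                              # near-zero regularization -> recover w_true
w = np.linalg.solve(X.T @ X + lam * np.eye(4), X.T @ y)
```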
In equation (5), X^T = (x_1, x_2, …, x_n), where X is the circulant matrix constructed from the cyclic shifts of the sample feature x. By the diagonalization theorem, every circulant matrix can be diagonalized in Fourier space using the discrete Fourier matrix:

X = F diag(x̂) F^H (6)

In equation (6), F is the unitary discrete Fourier transform matrix, x̂ is the discrete Fourier transform of x, and F^H is the Hermitian transpose of F.
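The diagonalization theorem behind equation (6) can be verified numerically: a circulant matrix whose rows are the cyclic shifts of x is diagonalized by the unitary DFT matrix, with eigenvalues equal to the DFT of x. The 4-element vector is an arbitrary example.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
n = len(x)
C = np.stack([np.roll(x, i) for i in range(n)])   # circulant: rows are cyclic shifts of x
F = np.fft.fft(np.eye(n)) / np.sqrt(n)            # unitary DFT matrix; columns are Fourier modes
lhs = C @ F                                       # C applied to the Fourier basis
rhs = F @ np.diag(np.fft.fft(x))                  # F diag(x_hat): eigenvalues are fft(x)
```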
A kernel function is then used to convert the nonlinear problem into a linear problem in a high-dimensional space, simplifying the parameter training:

ω = Σ_i α_i φ(x_i) (7)

In equation (7), φ(x_i) is the mapping of the sample x_i from the low-dimensional space to the high-dimensional space. Thus, the solution of the classifier is:

α = (K + λI)^{-1} Y (8)
In equation (8), K is the kernel correlation matrix of the training samples. For a certain class of kernel functions, the kernel correlation matrix is still guaranteed to be circulant; in the present invention, a Gaussian kernel function is selected:

κ(x, x′) = exp(−(‖x‖^2 + ‖x′‖^2 − 2F^{-1}(x̂* ⊙ x̂′)) / σ^2) (9)

In equation (9), σ is the kernel function parameter, * denotes the complex conjugate in the frequency domain, and F^{-1} is the inverse Fourier transform. Therefore, after the Fourier transform, using the property that a circulant matrix can be diagonalized by the Fourier transform matrix, one obtains:

α̂ = ŷ / (k̂^{xx} + λ) (10)

In equation (10), ^ denotes the result after the Fourier transform, and k^{xx} is the kernel correlation of the training sample with itself.
In constructing and training the position filter, a Gaussian distribution response map of size (M × N) × (1 + padding) is first constructed (since the size of the filter cannot change afterwards, this size is constructed directly), and then the extracted features are substituted into equations (4)–(10) above to obtain the position filter.
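A minimal sketch of the two-dimensional Gaussian distribution response map used as the training label; the centered peak and the σ value are illustrative assumptions.

```python
import numpy as np

def gaussian_label(h, w, sigma):
    """2-D Gaussian response map peaked at the patch center (hedged sketch)."""
    ys = np.arange(h) - h // 2
    xs = np.arange(w) - w // 2
    Y, X = np.meshgrid(ys, xs, indexing="ij")   # row/column offsets from the center
    return np.exp(-(X**2 + Y**2) / (2 * sigma**2))
```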
The position filter then performs a correlation operation on the candidate-region features in the next frame of the target image; the position where the response value is largest is the position of the target, i.e., the target center position (step S120), expressed by the following formula:

f(z) = K^z α (11)

In equation (11), K^z = κ(z_i, x_i) is the kernel correlation of the test sample with all training samples, where the test sample is the target sample of the previous frame and z_i is a candidate sample feature vector.
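Equations (9)–(11) fit together as a short train-and-detect cycle. The sketch below is a hedged, single-channel version: practical implementations often normalize the kernel argument by the number of pixels, but here we follow equation (9) literally, and all names are ours.

```python
import numpy as np

def gaussian_correlation(x, z, sigma):
    """Kernel correlation of equation (9), evaluated for all cyclic shifts at once
    via the Fourier domain (single-channel 2-D patches)."""
    c = np.real(np.fft.ifft2(np.conj(np.fft.fft2(x)) * np.fft.fft2(z)))
    d = np.maximum(np.sum(x**2) + np.sum(z**2) - 2.0 * c, 0.0)
    return np.exp(-d / sigma**2)

def train_dual(x, y, sigma, lam):
    """Equation (10): alpha_hat = y_hat / (k_hat^{xx} + lambda)."""
    k = gaussian_correlation(x, x, sigma)
    return np.fft.fft2(y) / (np.fft.fft2(k) + lam)

def respond(alpha_hat, x, z, sigma):
    """Equation (11) in the Fourier domain: f(z) = F^{-1}(k_hat^{xz} . alpha_hat);
    the argmax of the response is the new target center."""
    k = gaussian_correlation(x, z, sigma)
    return np.real(np.fft.ifft2(np.fft.fft2(k) * alpha_hat))
```

Detecting on the training patch itself should reproduce the Gaussian label, so the response peaks at the label's center.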
When constructing and training the length filter, the scale of the target frame in the y direction (the width direction) is kept unchanged and only the scale in the x direction (the length direction) is varied: S candidate frames of different scales are cropped with the target image as the center, and the feature vector f at each scale is obtained. A 1 × S Gaussian distribution response map G is constructed, and the Gaussian response map and the feature vectors at all scales are substituted into equations (12) and (13) to obtain the length filter H. When constructing and training the width filter, the x-direction scale obtained above is substituted, and the width filter is obtained in the same way.
When constructing the length and width filters, since multi-channel features are introduced, in order to make all the channel features fit one filter template, the loss function is established by the ridge regression method:

ε = ‖Σ_{l=1}^{d} h^l ⊛ f^l − g‖^2 + λ Σ_{l=1}^{d} ‖h^l‖^2 (12)

In equation (12), h^l is the filter, f is the feature vector, g is the Gaussian response function, and l indexes the feature channel (dimension). Taking the derivative of equation (12) with respect to h^l and setting it to zero gives the filter H^l in the frequency domain:

H^l = (G* F^l) / (Σ_{k=1}^{d} (F^k)* F^k + λ) (13)
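A hedged sketch of the closed-form multi-channel 1-D filter of equations (12)–(13), applied over S candidate scales; the feature shapes, λ value, and function names are illustrative assumptions.

```python
import numpy as np

def train_scale_filter(feats, g, lam=1e-6):
    """Equation (13): numerator A^l = conj(G) * F^l,
    denominator B = sum_k conj(F^k) * F^k + lam.
    feats: (d, S) array, one row per feature channel; g: (S,) Gaussian response."""
    G = np.fft.fft(g)
    F = np.fft.fft(feats, axis=1)
    A = np.conj(G)[None, :] * F                      # per-channel numerator
    B = np.sum(np.conj(F) * F, axis=0).real + lam    # shared denominator
    return A, B

def scale_response(A, B, feats):
    """Correlation response over the S candidate scales; argmax picks the scale."""
    Z = np.fft.fft(feats, axis=1)
    return np.real(np.fft.ifft(np.sum(np.conj(A) * Z, axis=0) / B))
```

Evaluating the filter on its own training features should reproduce the 1-D Gaussian label, so the response peaks at the label's center scale index.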
When the length and width filters are used to detect the target scale in the current frame target image, the feature vectors at multiple scales in the length direction of the target frame (the length space features) are first correlated with the length filter to obtain the maximum response in the length direction; the position of the maximum response is the target length scale in the current frame. The target length scale is then substituted into the width filter to obtain the target width scale in the current frame, thereby obtaining the estimated target frame of the target in the current frame target image (step S130).
In this embodiment, processing the second frame target image after processing the first frame target image includes: extracting position features from the second frame target image according to the first frame candidate region, and performing the correlation operation on the position features and the initial position filter to obtain the estimated center position coordinates of the target in the second frame target image; extracting length space features and width space features from the second frame target image according to the initial target frame, and performing the correlation operation on them with the initial length filter and the initial width filter respectively to obtain the estimated target frame of the target in the second frame target image; and determining a new candidate region according to the estimated center position coordinates of the target in the second frame target image, obtaining a new position filter from the new candidate region, and obtaining a new length filter and a new width filter from the estimated target frame of the target in the second frame target image. The initial position filter, the initial length direction filter, and the initial width filter obtained from the first frame target image are then iteratively updated with the new position filter, new length filter, and new width filter obtained from the second frame target image, correspondingly yielding the updated position filter, updated length direction filter, and updated width filter applied to the next frame of image.
As can be seen from the above, the only difference between processing the second frame target image and processing the subsequent frames is the source of the features. In the second frame, the position features correlated with the initial position filter are obtained from the first frame candidate region; in subsequent frames, the position features are obtained from the new candidate region constructed around the estimated target center position detected in the previous frame. Likewise, in the second frame, the length space features and width space features correlated with the initial length filter and initial width filter are obtained from the initial target frame; in subsequent frames, they are obtained from the estimated target frame produced by the scale detection of the previous frame.
In the frames of target images after the first frame, the target position (i.e., the center position coordinates of the target) and the target scale (i.e., the target frame, from which the length-direction and width-direction scales of the target can be read) detected in the previous frame are used, together with the updated filters obtained from the previous frame, in the correlation operation to obtain the target position and target scale in the current frame target image, thereby realizing tracking of the target.
In this embodiment, new filters optimal for the current frame are further trained from the target position and target scale detected in the current frame target image, and these new filters are then interpolated with the updated filters obtained from the previous frame to obtain the updated filters for the next frame (step S150).
Specifically, the position filter is updated as:
H_t = (1 − η) H_{t−1} + η α_t (14)

In equation (14), α_t is the new position filter obtained by training on the current frame, H_{t−1} is the updated filter of frame t − 1 (the previous frame), and H_t is the updated position filter obtained for frame t (the current frame).
Specifically, the length and width filters are updated as follows:

A_t^l = (1 − η) A_{t−1}^l + η G_t* F_t^l (15)

B_t = (1 − η) B_{t−1} + η Σ_{k=1}^{d} (F_t^k)* F_t^k (16)

In equation (15), A_t^l is the updated numerator of the length (or width) filter, G_t* F_t^l is the filter numerator newly trained on the current frame, and A_{t−1}^l is the updated filter numerator of frame t − 1 (the previous frame). In equation (16), B_t is the updated denominator of the length (or width) filter, Σ_k (F_t^k)* F_t^k is the filter denominator newly trained on the current frame, and B_{t−1} is the updated filter denominator of frame t − 1 (the previous frame). In equations (15) and (16), η is the filter learning rate parameter.
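The update rules of equations (14)–(16) reduce to simple exponential running averages; a minimal sketch follows (the function name and the η default are ours, not the patent's).

```python
import numpy as np

def update_filters(H_prev, A_prev, B_prev, H_new, A_new, B_new, eta=0.025):
    """Equations (14)-(16): exponential running averages. The length/width filter
    keeps numerator A and denominator B separate, so the division into an actual
    filter happens only at detection time."""
    H = (1.0 - eta) * H_prev + eta * H_new   # position filter, equation (14)
    A = (1.0 - eta) * A_prev + eta * A_new   # numerator, equation (15)
    B = (1.0 - eta) * B_prev + eta * B_new   # denominator, equation (16)
    return H, A, B
```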
As shown in fig. 4, a flowchart of steps when tracking an object in an image by using the scale-adaptive tracking method for image object aspect ratio change in the present application is provided.
The above scale-adaptive tracking method for image target aspect ratio change uses separate length and width filters to adapt to the target's length and width scales, which distinguishes it from the DSST-style tracking methods mentioned above, which can only estimate an overall target scale change and cannot adapt to changes in the target aspect ratio. Such scale estimation methods can only enlarge or reduce the target frame proportionally, and therefore cannot accurately adapt to aspect ratio changes of the target; consequently, when the target undergoes a significant appearance change caused by a posture change, tracking drift or even tracking failure often occurs. For the situation in which the relative distance and relative azimuth angle between the target and the background change continuously, the method estimates the scale changes of the target in the length direction and the width direction with two independent filters, realizing accurate tracking of the target. The scale-adaptive tracking method for target aspect ratio change meets the real-time requirements of target tracking while improving the success rate and accuracy of target tracking. The application further expands the application range of target tracking methods: it can be used for tracking ground moving targets, and has important theoretical research significance and broad application prospects for target tracking from fast-moving aircraft.
It should be understood that, although the steps in the flowchart of fig. 1 are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated herein, the order of these steps is not strictly limited, and they may be performed in other orders. Moreover, at least some of the steps in fig. 1 may include multiple sub-steps or stages that are not necessarily performed at the same time but may be performed at different times, and their order of execution is not necessarily sequential; they may be performed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 5, there is provided a scale-adaptive tracking apparatus for image target aspect ratio variation, including: a target data set obtaining module 200, a feature extraction module 210, a position estimation module 220, a target frame estimation module 230, a current frame optimal filter obtaining module 240, and a filter updating module 250, wherein:
a target data set obtaining module 200, configured to obtain a target data set to be tracked, where the target data set includes multiple frames of target images arranged in a time sequence;
the feature extraction module 210 is configured to, when performing target tracking on multiple target images subsequent to a second frame of target image, extract a length spatial feature and a width spatial feature in a current frame of target image according to an estimated target frame obtained from a previous frame of target image, determine a current frame candidate region according to an estimated center position coordinate of a target in the previous frame of target image, and extract a position feature in the current frame of target image according to the current frame candidate region;
the position estimation module 220 is configured to perform correlation operation on the position feature and an updated position filter obtained from a previous frame of target image to obtain an estimated center position coordinate of a target in a current frame of image;
the target frame estimation module 230 is configured to perform correlation operation on the length spatial feature and the width spatial feature respectively with an updated length filter and an updated width filter obtained from a previous frame of target image to obtain an estimated target frame of a target in a current frame of target image;
a current frame optimal filter obtaining module 240, configured to determine a new candidate region according to the estimated center position coordinates of the target in the current frame target image, construct and train according to the position features extracted by the new candidate region in the current frame target image to obtain a new position filter, and construct and train according to the length space features and the width space features extracted by the estimated target frame of the target in the current frame target image to obtain a new length filter and a new width filter;
and a filter updating module 250, configured to iteratively update the updated position filter, the updated length direction filter, and the updated width filter obtained from the previous frame of target image according to the new position filter, the new length filter, and the new width filter obtained from the current frame of target image, so as to obtain an updated position filter, an updated length direction filter, and an updated width filter for application to the next frame of target image.
For the specific definition of the scale-adaptive tracking apparatus for image target aspect ratio change, reference may be made to the above definition of the scale-adaptive tracking method for image target aspect ratio change, which is not repeated here. Each module in the above scale-adaptive tracking apparatus can be implemented wholly or partially by software, hardware, or a combination thereof. The modules can be embedded in hardware in, or be independent of, the processor of the computer device, or be stored in software in the memory of the computer device, so that the processor can invoke them and execute the operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a terminal, and its internal structure diagram may be as shown in fig. 6. The computer device includes a processor, a memory, a network interface, a display screen, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a method for scale-adaptive tracking of changes in aspect ratios of image objects. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, a key, a track ball or a touch pad arranged on a shell of the computer equipment, an external keyboard, a touch pad or a mouse and the like.
Those skilled in the art will appreciate that the architecture shown in fig. 6 is merely a block diagram of part of the structure related to the solution of the present application and does not limit the computing devices to which the solution applies; a particular computing device may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, comprising a memory and a processor, the memory having a computer program stored therein, the processor implementing the following steps when executing the computer program:
acquiring a target data set to be tracked, wherein the target data set comprises a plurality of frames of target images which are arranged in a time sequence;
when a plurality of frame target images behind a second frame target image are subjected to target tracking, length space characteristics and width space characteristics are extracted from a current frame image according to an estimated target frame obtained from a previous frame target image, a current frame candidate area is determined according to estimated center position coordinates of a target in the previous frame target image, and position characteristics are extracted from the current frame target image according to the current frame candidate area;
performing correlation operation on the position characteristics and an updated position filter obtained from a previous frame of target image to obtain an estimated central position coordinate of a target in a current frame of image;
performing correlation operation on the length space characteristic and the width space characteristic and an updated length filter and an updated width filter obtained from a previous frame of target image respectively to obtain an estimated target frame of a target in a current frame of target image;
determining a new candidate area according to the estimated central position coordinates of the target in the current frame target image, constructing and training according to the position features extracted from the current frame target image by the new candidate area to obtain a new position filter, and constructing and training according to the length space features and the width space features extracted from the current frame target image by the estimated target frame of the target in the current frame target image to obtain a new length filter and a new width filter;
and respectively carrying out iterative updating on the updated position filter, the updated length direction filter and the updated width filter obtained from the previous frame of target image according to the new position filter, the new length filter and the new width filter obtained from the current frame of target image to correspondingly obtain the updated position filter, the updated length direction filter and the updated width filter applied to the next frame of image.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored which, when executed by a processor, performs the steps of:
acquiring a target data set to be tracked, wherein the target data set comprises a plurality of frames of target images which are arranged in a time sequence;
when a plurality of frame target images behind a second frame target image are subjected to target tracking, length space characteristics and width space characteristics are extracted from a current frame image according to an estimated target frame obtained from a previous frame target image, a current frame candidate area is determined according to estimated center position coordinates of a target in the previous frame target image, and position characteristics are extracted from the current frame target image according to the current frame candidate area;
performing correlation operation on the position characteristics and an updated position filter obtained from a previous frame of target image to obtain an estimated central position coordinate of a target in a current frame of image;
performing correlation operation on the length space characteristic and the width space characteristic and an updated length filter and an updated width filter obtained from a previous frame of target image respectively to obtain an estimated target frame of a target in a current frame of target image;
determining a new candidate area according to the estimated central position coordinates of the target in the current frame target image, constructing and training according to the position features extracted from the current frame target image by the new candidate area to obtain a new position filter, and constructing and training according to the length space features and the width space features extracted from the current frame target image by the estimated target frame of the target in the current frame target image to obtain a new length filter and a new width filter;
and respectively carrying out iterative updating on the updated position filter, the updated length direction filter and the updated width filter obtained from the previous frame of target image according to the new position filter, the new length filter and the new width filter obtained from the current frame of target image to correspondingly obtain the updated position filter, the updated length direction filter and the updated width filter applied to the next frame of image.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above may be implemented by hardware instructions of a computer program, which may be stored in a non-volatile computer-readable storage medium, and when executed, may include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory, among others. Non-volatile memory can include read-only memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDRSDRAM), Enhanced SDRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), Rambus Direct RAM (RDRAM), direct bus dynamic RAM (DRDRAM), and memory bus dynamic RAM (RDRAM).
The technical features of the above embodiments can be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of these technical features contains no contradiction, it should be considered within the scope of this specification.
The above-mentioned embodiments only express several embodiments of the present application, and the description thereof is more specific and detailed, but not construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, which falls within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (8)

1. A scale adaptive tracking method for image target aspect ratio change is characterized by comprising the following steps:
acquiring a target data set to be tracked, wherein the target data set comprises a plurality of frames of target images which are arranged in a time sequence;
when a plurality of frame target images behind a second frame target image are subjected to target tracking, length space characteristics and width space characteristics are extracted from a current frame image according to an estimated target frame obtained from a previous frame target image, a current frame candidate area is determined according to estimated center position coordinates of a target in the previous frame target image, and position characteristics are extracted from the current frame target image according to the current frame candidate area;
performing correlation operation on the position characteristics and an updated position filter obtained from a previous frame of target image to obtain an estimated central position coordinate of a target in a current frame of image;
performing correlation operation on the length space characteristic and the width space characteristic and an updated length filter and an updated width filter obtained from a previous frame of target image respectively to obtain an estimated target frame of a target in a current frame of target image;
determining a new candidate area according to the estimated central position coordinates of the target in the current frame target image, constructing and training according to the position characteristics of the new candidate area extracted from the current frame target image to obtain a new position filter, and constructing and training according to the length space characteristics and the width space characteristics of the estimated target frame of the target in the current frame target image to obtain a new length filter and a new width filter;
and respectively carrying out iterative updating on the updated position filter, the updated length direction filter and the updated width filter obtained from the previous frame of target image according to the new position filter, the new length filter and the new width filter obtained from the current frame of target image to correspondingly obtain the updated position filter, the updated length direction filter and the updated width filter applied to the next frame of image.
2. The scale-adaptive tracking method according to claim 1, wherein, before target tracking is performed on the plurality of frames of target images after the second frame of target image, the method further comprises processing a first frame of target image in the target data set:
determining an initial target frame in the first frame target image, and determining a first frame candidate region according to the position of the central point of the initial target frame;
constructing and training an initial position filter according to the position characteristics of the first frame candidate region extracted from the first frame target image;
and respectively constructing and training according to the length space characteristic and the width space characteristic of the initial target frame extracted from the first frame target image to obtain an initial length direction filter and an initial width direction filter.
3. The scale-adaptive tracking method of claim 2, wherein processing the first frame target image followed by processing the second frame target image comprises:
extracting position characteristics in a second frame target image according to the first frame candidate region, and performing correlation operation on the position characteristics and an initial position filter to obtain an estimated central position coordinate of a target in the second frame target image;
extracting length space features and width space features from a second frame target image according to the initial target frame, and performing correlation operation on the length space features and the width space features with an initial length filter and an initial width filter respectively to obtain an estimated target frame of a target in the second frame target image;
determining a new candidate region according to the estimated central position coordinates of the target in the second frame of target image, obtaining a new position filter according to the new candidate region, and obtaining a new length filter and a new width filter according to the estimated target frame of the target in the second frame of target image;
and respectively carrying out iterative updating on the initial position filter, the initial length direction filter and the initial width filter obtained from the first frame of target image according to the new position filter, the new length filter and the new width filter obtained from the second frame of target image to correspondingly obtain an updated position filter, an updated length direction filter and an updated width filter applied to the next frame of image.
4. The adaptive scale tracking method according to claim 3, wherein, when determining the first frame candidate region according to the initial target frame in the first frame target image, if the size of the target frame is larger than ten thousand pixels, the image region where the first frame candidate region is located is downsampled so that the size of the target frame is smaller than ten thousand pixels.
5. The adaptive scale tracking method according to claim 3, wherein when each frame of target image is processed, the length space feature and the width space feature extracted according to the target frame and the position feature extracted according to the candidate region are fusion features of both the histogram of oriented gradients feature and the image gray scale feature.
6. The adaptive scale tracking method of claim 3, wherein constructing and training a position filter according to the position features during processing of each frame of the target image comprises:
and constructing a two-dimensional Gaussian distribution response map according to the position characteristics, and then obtaining a position filter according to the position characteristics and the two-dimensional Gaussian distribution response map.
7. The adaptive scale tracking method of claim 6, wherein constructing and training a position filter according to the position features during processing of each frame of the target image further comprises:
generating a plurality of training samples in an image area where a current frame target image is located in a candidate area according to a cyclic shift method, and extracting the position characteristics of each training sample;
and constructing and training a position filter according to the plurality of position features.
8. A device for scale-adaptive tracking of changes in aspect ratio of an image target, the device comprising:
the target data set acquisition module is used for acquiring a target data set to be tracked, wherein the target data set comprises a plurality of frames of target images which are arranged in a time sequence;
the characteristic extraction module is used for extracting length space characteristics and width space characteristics in a current frame image according to an estimated target frame obtained from a previous frame target image when target tracking is carried out on a plurality of frame target images behind a second frame target image, determining a current frame candidate area according to the estimated center position coordinates of a target in the previous frame target image, and extracting position characteristics in the current frame target image according to the current frame candidate area;
the position estimation module is used for performing a correlation operation between the position feature and the updated position filter obtained from the previous frame target image, to obtain the estimated center position coordinates of the target in the current frame image;
the target frame estimation module is used for performing correlation operations between the length space feature and the updated length filter, and between the width space feature and the updated width filter, both obtained from the previous frame target image, to obtain the estimated target frame of the target in the current frame target image;
the current frame optimal filter obtaining module is used for determining a new candidate region according to the estimated center position coordinates of the target in the current frame target image, constructing and training a new position filter according to the position features of the new candidate region extracted from the current frame target image, and constructing and training a new length filter and a new width filter according to the length space features and the width space features of the estimated target frame extracted from the current frame target image;
and the filter updating module is used for iteratively updating the updated position filter, the updated length filter and the updated width filter obtained from the previous frame target image according to the new position filter, the new length filter and the new width filter obtained from the current frame target image, so as to correspondingly obtain the updated position filter, the updated length filter and the updated width filter applied to the next frame image.
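The iterative update performed by the filter updating module is, in most correlation-filter trackers, a running-average blend with a learning rate; the rate value below is an assumed typical choice, not taken from the patent:

```python
import numpy as np

def update_filter(prev_filter, new_filter, lr=0.025):
    """Blend the model carried over from the previous frame with the
    filter trained on the current frame; the result is the updated
    filter applied to the next frame."""
    return (1.0 - lr) * prev_filter + lr * new_filter
```

The same rule would be applied independently to the position, length, and width filters.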
CN202111171804.4A 2021-10-08 2021-10-08 Scale self-adaptive tracking method and device for image target length-width ratio change Active CN113822912B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111171804.4A CN113822912B (en) 2021-10-08 2021-10-08 Scale self-adaptive tracking method and device for image target length-width ratio change


Publications (2)

Publication Number Publication Date
CN113822912A CN113822912A (en) 2021-12-21
CN113822912B true CN113822912B (en) 2022-09-16

Family

ID=78916190

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111171804.4A Active CN113822912B (en) 2021-10-08 2021-10-08 Scale self-adaptive tracking method and device for image target length-width ratio change

Country Status (1)

Country Link
CN (1) CN113822912B (en)

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108876816B (en) * 2018-05-31 2020-07-10 西安电子科技大学 Target tracking method based on self-adaptive target response
CN109685073A (en) * 2018-12-28 2019-04-26 南京工程学院 A kind of dimension self-adaption target tracking algorism based on core correlation filtering
CN109977971A (en) * 2019-03-29 2019-07-05 苏州大学 Dimension self-adaption Target Tracking System based on mean shift Yu core correlation filtering
CN110223323B (en) * 2019-06-02 2022-03-04 西安电子科技大学 Target tracking method based on depth feature adaptive correlation filtering


Similar Documents

Publication Publication Date Title
CN113012203B (en) High-precision multi-target tracking method under complex background
CN109670474B (en) Human body posture estimation method, device and equipment based on video
CN108776975B (en) Visual tracking method based on semi-supervised feature and filter joint learning
CN107358623B (en) Relevant filtering tracking method based on significance detection and robustness scale estimation
CN110796093A (en) Target tracking method and device, computer equipment and storage medium
CN109448023B (en) Satellite video small target real-time tracking method
CN106875426B (en) Visual tracking method and device based on related particle filtering
JP6597914B2 (en) Image processing apparatus, image processing method, and program
CN111178261B (en) Face detection acceleration method based on video coding technology
Han et al. Visual tracking by continuous density propagation in sequential Bayesian filtering framework
CN106375659A (en) Electronic image stabilization method based on multi-resolution gray projection
CN112967388A (en) Training method and device for three-dimensional time sequence image neural network model
CN109584267B (en) Scale adaptive correlation filtering tracking method combined with background information
WO2015176502A1 (en) Image feature estimation method and device
CN110751670B (en) Target tracking method based on fusion
CN113033356B (en) Scale-adaptive long-term correlation target tracking method
CN113822912B (en) Scale self-adaptive tracking method and device for image target length-width ratio change
CN117671031A (en) Binocular camera calibration method, device, equipment and storage medium
CN113158974A (en) Attitude estimation method, attitude estimation device, computer equipment and storage medium
CN113936036B (en) Target tracking method and device based on unmanned aerial vehicle video and computer equipment
CN113822911B (en) Tracking method and device of columnar inclined target, computer equipment and storage medium
CN112348847B (en) Target scale self-adaptive tracking method
CN115345902A (en) Infrared image dim target detection tracking method and system based on machine learning
CN112862002A (en) Training method of multi-scale target detection model, target detection method and device
CN113538240B (en) SAR image superpixel generation method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant