CN111860189B - Target tracking method and device - Google Patents
- Publication number: CN111860189B
- Application number: CN202010588437A
- Authority
- CN
- China
- Prior art keywords
- target
- block
- candidate
- region
- frame image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G06V20/49 — Segmenting video sequences, i.e. computational techniques such as parsing or cutting the sequence, low-level clustering or determining units such as shots or scenes
- G06F18/214 — Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F18/23 — Clustering techniques
- G06F18/24155 — Bayesian classification
- G06V20/41 — Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
- G06V20/46 — Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
Abstract
The invention discloses a target tracking method and device, relating to the technical field of image processing. The method comprises the following steps: determining candidate regions for each block of the target in the current frame image; performing compressed sampling on the block candidate regions and inputting the resulting image features into a trained classifier to obtain category prediction scores for the block candidate regions; screening out the region where each block is located according to these scores; and, when an occluded block exists, calculating the position coordinates of the target in the current frame image from the position coordinates of the remaining, unoccluded blocks. Through these steps, the problems of poor tracking effect and weak tracking-algorithm robustness caused by occlusion and scale change in the existing compressive tracking algorithm can be solved.
Description
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to a target tracking method and apparatus.
Background
In many optoelectronic tracking and search systems, targets move fast and their scale changes greatly, so the real-time performance and accuracy requirements on the tracking algorithm are high. Developing a fast and efficient tracking algorithm is therefore a fundamental need for an optoelectronic tracking and search system.
The compressive tracking algorithm is a simple and fast tracking algorithm that attracted wide attention as soon as it was proposed. However, it has certain limitations. First, compressive tracking is not robust to occlusion: when the target is occluded, the tracking effect is poor. In addition, the tracking window of the compressive tracking algorithm is fixed, so it is not robust to scale changes.
In view of these defects, a new scheme is needed to solve the problems of poor tracking effect and weak tracking-algorithm robustness caused by occlusion and scale change in the existing compressive tracking algorithm.
Disclosure of Invention
First, the technical problem to be solved
The invention aims to solve the technical problems of poor tracking effect and weak tracking-algorithm robustness caused by occlusion and scale change in the existing compressive tracking algorithm.
(II) technical scheme
In order to solve the technical problems, in one aspect, the present invention provides a target tracking method.
The target tracking method of the invention comprises the following steps: determining candidate regions for each block of the target in the current frame image; performing compressed sampling on the block candidate regions and inputting the resulting image features into a trained classifier to obtain category prediction scores for the block candidate regions; screening, from the candidate regions of each block, the region where that block is located according to the category prediction scores; and, when an occluded block is determined to exist, calculating the position coordinates of the target in the current frame image from the position coordinates of the regions where the blocks other than the occluded block are located.
Optionally, the method further comprises: determining a candidate region of a target in a current frame image according to the position coordinates of the target in the current frame image and the scale of the target in a previous frame image; performing compression sampling and normalization processing on the candidate region of the target, and inputting the processed image features into a trained classifier to obtain a category prediction score of the candidate region of the target; and screening the region where the target is located from the candidate regions of the target according to the category prediction scores of the candidate regions of the target, and taking the scale of the region where the target is located as the scale of the target in the current frame image.
Optionally, the performing compressed sampling on the candidate regions of a block and inputting the image features obtained by the compressed sampling into a trained classifier to obtain category prediction scores for the block's candidate regions comprises: extracting a plurality of Haar features from each candidate region of the block and taking them as the image features obtained by compressed sampling; and inputting the plurality of Haar features into a trained first Bayesian classifier to obtain the category prediction score of the candidate region.
Optionally, the response of the trained first Bayesian classifier satisfies:

$$H_i(y_i)=\sum_{j=1}^{L} w_{ij}\,\log\frac{p(y_{ij}\mid k=1)}{p(y_{ij}\mid k=0)},\qquad i=1,2,\dots,N$$

wherein H_i(y_i) is the category prediction score of the candidate region of the i-th block, and N is the total number of blocks; p(y_ij | k=1) is the predicted probability that feature y_ij is a target feature; p(y_ij | k=0) is the predicted probability that feature y_ij is a background feature; y_ij is the j-th Haar feature in the candidate region of the i-th block; L is the number of Haar features in the candidate region of each block; w_ij is the weight of y_ij, determined by the distance between the center coordinates of the Haar feature and the center coordinates (x_c, y_c) of the whole target; and β is a constant related to the diagonal of the target region.
Optionally, the method further comprises: extracting image feature sequences from regions close to the target in the previous frame image as positive samples, extracting image feature sequences from regions far from the target in the previous frame image as negative samples, and training the first Bayesian classifier with the positive and negative samples to obtain the trained first Bayesian classifier.
Optionally, whether an occluded block exists is determined as follows: constructing, for each block, a clustering feature vector from the offset of the position coordinates of the region where the block is located relative to the position coordinates of the region where the corresponding block is located in the previous frame image, together with the category prediction score of the region where the block is located; clustering the blocks according to these clustering feature vectors; and judging from the clustering result whether an occluded block exists.
Optionally, the determining the candidate region of the target in the current frame image according to the position coordinate of the target in the current frame image and the scale of the target in the previous frame image includes: constructing a reference area by taking the position coordinate of the target in the current frame image as the center of the reference area and taking the scale of the target in the previous frame image as the scale of the reference area; and performing enlargement and reduction processing on the reference region to obtain a candidate region of the target.
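The scale-search step above can be sketched as follows: a reference region is built at the target's new position with the previous frame's scale, then enlarged and reduced to form scale candidates. The function name and the ratio list are illustrative assumptions; the patent does not fix specific scaling values.

```python
# Hedged sketch of the scale-candidate construction step. The set of
# enlargement/reduction ratios is an illustrative assumption.
def scale_candidates(center, prev_w, prev_h, ratios=(0.9, 0.95, 1.0, 1.05, 1.1)):
    """Return (x, y, w, h) regions centered at `center`, scaled from the
    previous frame's target size by each ratio in `ratios`."""
    cx, cy = center
    out = []
    for r in ratios:
        w, h = prev_w * r, prev_h * r
        # keep the region centered while the scale changes
        out.append((cx - w / 2, cy - h / 2, w, h))
    return out
```

Each candidate is then compression-sampled and scored by the classifier, and the best-scoring scale is kept as the target's scale in the current frame.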
Optionally, the performing compressive sampling and normalization processing on the candidate region of the target, and inputting the image features obtained by the processing into a trained classifier, so as to obtain a category prediction score of the candidate region of the target includes: extracting a plurality of Haar features from each candidate region of the target, carrying out normalization processing on the plurality of Haar features, and inputting the processed image features into a trained second Bayesian classifier to obtain a category prediction score of the candidate region.
Optionally, the method further comprises: when no occluded block exists, calculating the position coordinates of the target in the current frame image from the position coordinates of the regions where all blocks of the target are located.
In order to solve the technical problems, another aspect of the present invention provides an object tracking device.
The object tracking device of the present invention includes: a determining module for determining candidate regions for each block of the target in the current frame image; a screening module for performing compressed sampling on the block candidate regions, inputting the resulting image features into a trained classifier to obtain category prediction scores for the block candidate regions, and screening, from the candidate regions of each block, the region where that block is located according to the category prediction scores; and a position calculation module for determining, when an occluded block is judged to exist, the position coordinates of the target in the current frame image from the position coordinates of the regions where the blocks other than the occluded block are located.
Optionally, the apparatus further comprises: the scale calculation module is used for determining a candidate region of the target in the current frame image according to the position coordinate of the target in the current frame image and the scale of the target in the previous frame image; the scale calculation module is further used for performing compression sampling and normalization processing on the candidate region of the target, and inputting the processed image features into a trained classifier to obtain a category prediction score of the candidate region of the target; the scale calculation module is further configured to screen an area where the target is located from the candidate areas of the target according to the category prediction score of the candidate areas of the target, and take the scale of the area where the target is located as the scale of the target in the current frame image.
(III) beneficial effects
The technical scheme of the invention has the following advantages: candidate regions are determined for each block of the target in the current frame image; the block candidate regions are compression-sampled and the resulting image features are input into a trained classifier to obtain category prediction scores; the region where each block is located is screened out from its candidate regions according to these scores; and, when an occluded block is judged to exist, the position coordinates of the target in the current frame image are calculated from the position coordinates of the regions where the blocks other than the occluded block are located.
Drawings
FIG. 1 is a schematic flow chart of a target tracking method according to a first embodiment of the invention;
FIG. 2 is a schematic flow chart of a target tracking method according to a second embodiment of the invention;
FIG. 3 is a schematic representation of a Haar feature in an embodiment of the present invention;
FIG. 4 is a schematic diagram of the main modules of the object tracking device in the third embodiment of the present invention;
fig. 5 is a schematic diagram of the main modules of the object tracking device in the fourth embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Example 1
FIG. 1 is a schematic flow chart of a target tracking method according to a first embodiment of the invention; as shown in fig. 1, the target tracking method provided by the embodiment of the invention includes:
step S101, determining candidate areas of each block of the target in the current frame image.
In one alternative example, candidate regions for each tile of the object in the current frame image may be determined based on object tracking results in the previous frame image, such as the scale and position of the object in the previous frame image. For example, the current frame image may be divided into a plurality of blocks according to the sizes of the blocks in the previous frame image and the position coordinates of the blocks, and searching may be performed by moving a search frame in each block and its neighboring regions, where the search area selected by each search frame may be regarded as a candidate area of the block.
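The sliding-search idea of step S101 can be sketched as follows; the function name, search radius, and grid step are illustrative assumptions rather than values fixed by the patent.

```python
# Hypothetical sketch of step S101: generate candidate regions for one block
# by sliding a search window around the block's previous-frame position.
def gen_block_candidates(prev_x, prev_y, w, h, radius=4, step=2):
    """Return (x, y, w, h) candidate windows whose top-left corner lies
    within `radius` pixels of the block's previous position, sampled on a
    grid of spacing `step`."""
    candidates = []
    for dx in range(-radius, radius + 1, step):
        for dy in range(-radius, radius + 1, step):
            candidates.append((prev_x + dx, prev_y + dy, w, h))
    return candidates
```

Each window returned here is one "candidate area of the block" that step S102 will score with the classifier.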
Step S102, performing compressed sampling on the candidate areas of the blocks, and inputting image features obtained by the compressed sampling into a trained classifier to obtain category prediction scores of the candidate areas of the blocks; and screening the region where the block is located from the candidate regions of the block according to the category prediction score of the candidate regions of the block.
In this step, each candidate region of each block of the target may be compression-sampled separately. In one alternative example, compressive sampling of a candidate region includes: performing convolution operations at different positions of the candidate region with neighborhoods of different sizes, taking the convolution results as first image features (also called "high-dimensional convolution features"), and then applying a projection matrix to the first image features to obtain second image features, i.e., the image features obtained by compressive sampling. A second image feature is essentially a linear combination of the pixel-value sums of several (e.g., 2, 3, or another number of) rectangular regions. That is, the second image feature can be obtained by linearly combining the pixel-value sums of a plurality of rectangular regions, which can be expressed as:

$$y_i=\sum_{j=1}^{s} a_{ij}\,x_j \tag{1}$$

wherein x_j is the pixel-value sum of the j-th rectangular region, j = 1, 2, …, s, and s is the number of rectangular regions participating in the linear combination; y_i is the second image feature obtained by formula (1), which in a specific implementation can be computed quickly using an integral image; and a_ij is a weight coefficient.
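Under the definitions of formula (1), each rectangle sum x_j can be read off an integral image in constant time, which is what makes compressed sampling fast. A minimal sketch (function names are illustrative, not from the patent):

```python
import numpy as np

def integral_image(img):
    """Integral image: ii[r, c] = sum of img[0:r+1, 0:c+1]."""
    return img.cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the rectangle [y, y+h) x [x, x+w), O(1) per call."""
    total = ii[y + h - 1, x + w - 1]
    if x > 0:
        total -= ii[y + h - 1, x - 1]
    if y > 0:
        total -= ii[y - 1, x + w - 1]
    if x > 0 and y > 0:
        total += ii[y - 1, x - 1]
    return total

def compressed_feature(ii, rects, weights):
    # y_i = sum_j a_ij * x_j, i.e. formula (1)
    return sum(a * rect_sum(ii, *r) for a, r in zip(weights, rects))
```

With weights of +1 and -1 on adjacent rectangles, this reduces to the Haar features described next.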
The Haar feature is an image feature based on formula (1); four commonly used Haar features are shown in fig. 3. A Haar feature is the difference between the pixel sum of the white area(s) and the pixel sum of the black area(s) in fig. 3. In an alternative example, a plurality of Haar features may be extracted in each candidate region of a block and input, as the image features obtained by compressive sampling, into a trained classifier to obtain a category prediction score for the candidate region. The trained classifier may be a trained Bayesian classifier, a decision-tree-based classifier, a neural-network-based classifier, or the like; it predicts whether a candidate region is the target or the background.
In an alternative embodiment, the category prediction score of a candidate region is specifically the prediction score that the region's category is the target. In this embodiment, the candidate region with the highest category prediction score may be selected from the candidate regions of a block and taken as the region where that block is located.
In another alternative embodiment, the category prediction score of a candidate region is specifically the prediction score that the region's category is the background. In this embodiment, the candidate region with the lowest category prediction score may be selected from the candidate regions of a block and taken as the region where that block is located.
In the embodiment of the present invention, the area where each block of the target in the current frame image is located may be determined through step S101 and step S102. Furthermore, the position coordinates of the region where each block of the object is located in the current frame image can be determined.
Step S103, when it is judged that an occluded block exists, calculating the position coordinates of the target in the current frame image according to the position coordinates of the regions where the blocks other than the occluded block are located.
For example, assume the target is divided into 5 blocks A, B, C, D, E. If block B is occluded, the position coordinates of the target in the current frame image are calculated from the position coordinates of the regions where blocks A, C, D, and E are located.
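One way to realize this step is sketched below, assuming each block stores its fixed offset from the target center (set when the target was first partitioned) so every unoccluded block yields an estimate of the center. The averaging strategy is an assumption; the patent only states that the target position is computed from the unoccluded blocks' coordinates.

```python
# Minimal sketch of step S103: estimate the target center from the
# unoccluded blocks only. Block layout and averaging are assumptions.
def target_position(blocks, occluded):
    """blocks: dict name -> ((cx, cy), (ox, oy)), the block's current center
    and its fixed offset from the target center; occluded: set of names of
    occluded blocks, which are excluded from the estimate."""
    estimates = [(cx - ox, cy - oy)
                 for name, ((cx, cy), (ox, oy)) in blocks.items()
                 if name not in occluded]
    n = len(estimates)
    return (sum(x for x, _ in estimates) / n,
            sum(y for _, y in estimates) / n)
```

With `occluded` empty this same function covers the no-occlusion case of step S205.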
In the embodiment of the invention, steps S101 to S103 combine the compressive-sensing tracking algorithm with block-wise tracking while judging whether an occluded block exists; when one exists, the position coordinates of the target in the current frame image are calculated through step S103. This improves the real-time performance and accuracy of target tracking and the robustness of the tracking algorithm to occlusion and scale change, thereby solving the problems of poor tracking effect and weak robustness caused by occlusion and scale change in the existing compressive tracking algorithm.
Example two
FIG. 2 is a schematic flow chart of a target tracking method according to a second embodiment of the invention; as shown in fig. 2, the target tracking method according to the embodiment of the present invention includes:
step S201, determining candidate areas of each block of the target in the current frame image.
In one alternative example, candidate regions for each tile of the object in the current frame image may be determined based on object tracking results in the previous frame image, such as the scale and position of the object in the previous frame image. For example, the current frame image may be divided into a plurality of blocks according to the sizes of the blocks in the previous frame image and the position coordinates of the blocks, and searching may be performed by moving a search frame in each block and its neighboring regions, where the search area selected by each search frame may be regarded as a candidate area of the block.
Step S202, performing compressed sampling on the candidate areas of the blocks, and inputting image features obtained by the compressed sampling into a trained classifier to obtain category prediction scores of the candidate areas of the blocks; and screening the areas of the blocks from the candidate areas of the blocks according to the category prediction scores of the candidate areas of the blocks.
In this step, each candidate region of each block of the target may be compression-sampled separately. In one alternative example, compressive sampling of a candidate region includes: performing convolution operations at different positions of the candidate region with neighborhoods of different sizes, taking the convolution results as first image features (also called "high-dimensional convolution features"), and then applying a projection matrix to the first image features to obtain second image features, i.e., the image features obtained by compressive sampling. A second image feature is essentially a linear combination of the pixel-value sums of several (e.g., 2, 3, or another number of) rectangular regions. That is, the second image feature can be obtained by linearly combining the pixel-value sums of a plurality of rectangular regions, which can be expressed as:

$$y_i=\sum_{j=1}^{s} a_{ij}\,x_j \tag{1}$$

wherein x_j is the pixel-value sum of the j-th rectangular region, j = 1, 2, …, s, and s is the number of rectangular regions participating in the linear combination; y_i is the second image feature obtained by formula (1), which in a specific implementation can be computed quickly using an integral image; and a_ij is a preset coefficient.
Haar features are one image feature based on equation (1), and four Haar features are commonly used as shown in fig. 3. The Haar feature is the difference between each pixel sum of the white area and each pixel sum of the black area in fig. 3. In one alternative example, a plurality of Haar features may be extracted in each candidate region of the block, and the plurality of Haar features input into a trained classifier to obtain a class prediction score for the candidate region.
In an alternative example, the trained classifier is a bayesian classifier (which may be denoted as a first bayesian classifier). In this alternative example, a plurality of Haar features respectively extracted in each candidate region of a block may be input into a trained first bayesian classifier to obtain a class prediction score for the candidate region. The category prediction score of the candidate region is specifically a prediction score of which the category is a target. After the class prediction score of each candidate region of a block is obtained, the candidate region with the highest class prediction score can be selected from each candidate region of the block, and the candidate region with the highest class prediction score is taken as the region where the block is located.
In an alternative embodiment, the first Bayesian classifier may be a naive Bayes classifier. Bayes' theorem is the mathematical basis of the naive Bayes classifier, which assumes that each feature value in the feature vector is independent of the others. A naive Bayes classifier may be constructed using the extracted features of the target. The response of the naive Bayes classifier can be written as formula (2):

$$H(y)=\sum_{j=1}^{L}\log\frac{p(y_{j}\mid k=1)}{p(y_{j}\mid k=0)} \tag{2}$$

where H(y) is the response of the naive Bayes classifier.
In another alternative embodiment, to further improve the target tracking effect, the inventors have improved the response of the first Bayesian classifier; the improved response satisfies:

$$H_i(y_i)=\sum_{j=1}^{L} w_{ij}\,\log\frac{p(y_{ij}\mid k=1)}{p(y_{ij}\mid k=0)},\qquad i=1,2,\dots,N \tag{3}$$

wherein H_i(y_i) is the category prediction score of the candidate region of the i-th block, and N is the total number of blocks; p(y_ij | k=1) is the predicted probability that feature y_ij is a target feature; p(y_ij | k=0) is the predicted probability that feature y_ij is a background feature; y_ij is the j-th Haar feature in the candidate region of the i-th block; L is the number of Haar features in the candidate region of each block; w_ij is the weight of y_ij, determined by the distance between the center coordinates of the Haar feature and the center coordinates (x_c, y_c) of the whole target; and β is a constant related to the diagonal of the target region.
In the improved response formula of the first Bayesian classifier, a weight w_ij is set for each Haar feature in a block's candidate regions. This accounts for the different contributions that Haar features extracted at different positions make to the category prediction score, improving the accuracy of the score and hence the accuracy of target tracking.
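The weighted response of formula (3) can be sketched as follows. Gaussian class-conditional densities p(y_ij | k) are assumed, as in standard compressive tracking, and the exponential-decay form of the weight is an assumption consistent with the text's statement that w_ij depends on the Haar feature's distance from the target center (x_c, y_c) and on the constant β; the patent does not reproduce the exact weight formula here.

```python
import math

def gaussian(y, mu, sigma):
    """Gaussian density, the assumed class-conditional model p(y | k)."""
    return math.exp(-(y - mu) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

def weight(feat_center, target_center, beta):
    # assumed form: decay with distance from the target center (x_c, y_c)
    dx = feat_center[0] - target_center[0]
    dy = feat_center[1] - target_center[1]
    return math.exp(-math.hypot(dx, dy) / beta)

def block_response(feats, pos_params, neg_params, centers, target_center, beta):
    """H_i(y_i) = sum_j w_ij * log(p(y_ij|k=1) / p(y_ij|k=0)), formula (3)."""
    h = 0.0
    for y, (mu1, s1), (mu0, s0), c in zip(feats, pos_params, neg_params, centers):
        h += weight(c, target_center, beta) * math.log(
            gaussian(y, mu1, s1) / gaussian(y, mu0, s0))
    return h
```

A positive response indicates the candidate region resembles the target model more than the background model; the candidate maximizing it is selected as the block's region.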
Further, before step S202, the method according to the embodiment of the present invention may further include the following steps: and training the first Bayesian classifier to obtain a trained first Bayesian classifier. For example, when the response formula of the improved first bayesian classifier shown in the formula (3) is adopted, some image feature sequences can be extracted from a region, which is very close to the target, in the previous frame of image to serve as positive samples, some image feature sequences can be extracted from a region, which is far away from the target, in the previous frame of image to serve as negative samples, and the first bayesian classifier is trained according to the positive samples and the negative samples, so that the trained first bayesian classifier is obtained.
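The training step above can be sketched as follows, under two stated assumptions: samples are drawn at ring-shaped distances around the target (the radii are illustrative), and the per-feature Gaussian parameters are blended with new sample statistics using a learning rate λ, as in standard compressive tracking rather than anything the patent specifies.

```python
import random

def sample_positions(center, r_min, r_max, n, rng):
    """Sample n positions whose distance from `center` lies in [r_min, r_max):
    small radii yield positive-sample windows, large radii negative ones."""
    out = []
    while len(out) < n:
        dx = rng.randint(-r_max, r_max)
        dy = rng.randint(-r_max, r_max)
        if r_min <= (dx * dx + dy * dy) ** 0.5 < r_max:
            out.append((center[0] + dx, center[1] + dy))
    return out

def update_gaussian(mu, sigma, values, lam=0.85):
    """Blend the old Gaussian parameters of one feature with the statistics
    of the newly sampled feature values (lam is an assumed learning rate)."""
    n = len(values)
    m = sum(values) / n
    v = sum((x - m) ** 2 for x in values) / n
    mu_new = lam * mu + (1 - lam) * m
    var_new = lam * sigma ** 2 + (1 - lam) * v + lam * (1 - lam) * (mu - m) ** 2
    return mu_new, var_new ** 0.5
```

The positive-sample statistics update the target model p(y | k=1) and the negative-sample statistics update the background model p(y | k=0).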
Step S203, judging whether an occluded block exists.
In one alternative example, whether an occluded block exists may be determined as follows: constructing, for each block, a clustering feature vector from the offset of the position coordinates of the region where the block is located relative to the position coordinates of the region where the corresponding block is located in the previous frame image, together with the category prediction score of the region where the block is located; clustering the blocks according to these clustering feature vectors; and judging from the clustering result whether an occluded block exists.
Further, in an optional implementation manner of the above optional example, the offset Δx of the abscissa of the center point of the area where any block is located compared with the abscissa of the center point of the area where the corresponding block is located in the previous frame image may be calculated i Comparing the ordinate of the central point of the region where the block is located with the ordinate of the central point of the region where the corresponding block is located in the previous frame of image i Z obtained by normalizing the class prediction score of the region where the block is located i Splicing to obtain the characteristic f of the block used for clustering i I.e. the characteristics of the partition used for the clustering satisfy: f (f) i =(Δx i ,Δy i ,z i ). After the block feature vectors used for clustering are obtained, a preset clustering algorithm, such as a K-means clustering algorithm, can be adopted to perform clustering processing on each block. The clustering function used in the clustering process can be represented by the following formula:
arg min_S Σ_{j=1}^{K} Σ_{f_i ∈ S_j} ||f_i − μ_j||²

wherein f_i is the feature vector of the i-th block used for clustering, K is the number of clusters, μ_j is the mean of the feature vectors of the blocks in the j-th cluster (i.e., the cluster center), S_j is the j-th cluster, and j ranges from 1 to K. The clustering result at which the sum function takes its minimum value is taken as the final clustering result, and whether a blocked partitioned area exists is then judged according to this final clustering result.
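A minimal sketch of this occlusion test, assuming K = 2 clusters and the heuristic that the cluster with the lower mean prediction score is the occluded one (the text leaves the decision rule on the clustering result unspecified):

```python
import numpy as np

def kmeans_two(features, iters=20):
    """Minimal Lloyd's K-means with K = 2 over block features f_i = (dx_i, dy_i, z_i).

    Deterministic initialization: the two points farthest apart."""
    d = ((features[:, None] - features[None]) ** 2).sum(-1)
    i, j = np.unravel_index(np.argmax(d), d.shape)
    centers = features[[i, j]].astype(float)
    for _ in range(iters):
        labels = ((features[:, None] - centers[None]) ** 2).sum(-1).argmin(axis=1)
        for c in range(2):
            if (labels == c).any():
                centers[c] = features[labels == c].mean(axis=0)
    return labels

def occluded_blocks(features, score_col=2):
    """Flag the cluster whose mean prediction score (column z_i) is lower
    as the occluded cluster -- a heuristic, not prescribed by the text."""
    labels = kmeans_two(features)
    means = [features[labels == c, score_col].mean() for c in (0, 1)]
    return labels == int(np.argmin(means))
```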
In the case where it is determined that there is a blocked partitioned area through step S203, step S204 is performed; in the case where it is determined by step S203 that there is no blocked divided area, step S205 is executed.
And S204, calculating the position coordinates of the target in the current frame image according to the position coordinates of the areas where the other blocks except the blocked block area are located.
For example, assuming that the target is divided into 5 blocks, namely blocks A, B, C, D and E, if block B is blocked, the position coordinates of the target in the current frame image are calculated from the position coordinates of the region where block A is located, the region where block C is located, the region where block D is located, and the region where block E is located.
Step S205, calculating the position coordinates of the target in the current frame image according to the position coordinates of the areas of the target where the blocks are located.
For example, assuming that the target is divided into 5 blocks in total, namely blocks A, B, C, D and E, if there is no blocked block, the position coordinates of the target in the current frame image are calculated from the position coordinates of the region where block A is located, the region where block B is located, the region where block C is located, the region where block D is located, and the region where block E is located.
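Steps S204 and S205 can be sketched as below, assuming each block stores its offset from the target center in the previous frame (a bookkeeping detail not fixed by the text), so that each unblocked block casts a vote for the target center:

```python
import numpy as np

def target_position(block_centers, block_offsets, occluded):
    """Estimate the target's center from the tracked block centers.

    block_offsets[i] is block i's center relative to the target center in
    the previous frame (an assumed convention); blocks flagged in
    `occluded` are excluded from the calculation, as in step S204.
    """
    keep = ~np.asarray(occluded)
    votes = np.asarray(block_centers)[keep] - np.asarray(block_offsets)[keep]
    return votes.mean(axis=0)  # average the per-block votes
```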
Step S206, determining a candidate region of the target in the current frame image according to the position coordinates of the target in the current frame image and the scale of the target in the previous frame image.
Illustratively, in this step, a reference region s_0 may be constructed by taking the position coordinates of the target in the current frame image as the center of the reference region and the scale of the target in the previous frame image as the scale of the reference region; the reference region s_0 is then subjected to enlargement or reduction processing to obtain a candidate region set S = {s_1, s_2, …, s_N} of the target. Further, if the scale of the reference region s_0 is represented by its side length L_0, the scale of any candidate region s_i can be expressed as L_i = a_i L_0, wherein a_i is the scale change coefficient corresponding to the i-th candidate region of the target, and i ranges from 1 to N, denoting any one candidate region of the target.
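The construction of the candidate region set can be sketched as follows; the scale change coefficients a_i and the (cx, cy, L) region representation are illustrative assumptions, not values prescribed by the text:

```python
def candidate_regions(center, L0, coeffs=(0.95, 1.0, 1.05)):
    """Build the candidate set S = {s_1, ..., s_N} around `center`,
    each region a (cx, cy, L_i) tuple with side length L_i = a_i * L0."""
    cx, cy = center
    return [(cx, cy, a * L0) for a in coeffs]
```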
Step S207, compression sampling and normalization processing are carried out on the candidate region of the target, and the image features obtained through processing are input into a trained classifier so as to obtain a category prediction score of the candidate region of the target.
The trained classifier in step S207 may be a trained bayesian classifier, a classifier based on a decision tree, or a classifier based on a neural network, and the trained classifier may predict whether the candidate region is a target or a background.
In an alternative example, the trained classifier in step S207 is a bayesian classifier (which may be denoted as a second bayesian classifier). In this alternative example, a plurality of Haar features may be extracted from each candidate region of the target, then normalized, and the processed image features may be input into a trained second bayesian classifier to obtain a class prediction score for the candidate region.
Further, in the above optional example, the normalizing the plurality of Haar features may use the following formula:
wherein y_i represents the Haar feature of the i-th candidate region of the target, and y'_i represents the Haar feature of the i-th candidate region of the target after normalization.
Further, in the above optional example, the second Bayesian classifier may be a naive Bayesian classifier. Bayes' theorem is the mathematical basis of the naive Bayesian classifier, which assumes that each eigenvalue in the feature vector is independent of the other eigenvalues. A naive Bayesian classifier may be constructed using the extracted features of the target, wherein the response of the naive Bayesian classifier can be as shown in the above formula (2).
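Since formula (2) is not reproduced in this text, the following sketch uses the standard compressive-tracking form of the naive Bayes response: a sum over features of Gaussian log-likelihood ratios, with equal class priors assumed:

```python
import numpy as np

def nb_response(y, mu1, sig1, mu0, sig0):
    """Naive Bayes response: sum over features of
    log p(y_j | target) - log p(y_j | background), each class modeled as a
    per-feature Gaussian (an assumption matching common compressive tracking)."""
    def log_gauss(v, mu, sig):
        return -0.5 * np.log(2 * np.pi * sig ** 2) - (v - mu) ** 2 / (2 * sig ** 2)
    return float(np.sum(log_gauss(y, mu1, sig1) - log_gauss(y, mu0, sig0)))
```

A positive response indicates the candidate region is more likely the target than the background under this model.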
Step S208, according to the category prediction score of the candidate region of the target, the region where the target is located is screened out from the candidate region of the target, and the scale of the region where the target is located is used as the scale of the target in the current frame image.
In an alternative embodiment, the class prediction score of the candidate region is specifically a prediction score of the target of the class of the candidate region. After obtaining the category prediction scores of the candidate areas of the target, the candidate area with the highest category prediction score can be selected from the candidate areas, and the scale of the candidate area with the highest category prediction score is taken as the scale of the target in the current frame image.
In another alternative embodiment, the class prediction score of the candidate region is specifically a prediction score of the background for the class of the candidate region. After obtaining the category prediction scores of the candidate areas of the target, selecting the candidate area with the lowest category prediction score from the candidate areas, and taking the scale of the candidate area with the lowest category prediction score as the scale of the target in the current frame image.
Further, the method of the embodiment of the invention can further comprise the following steps: after the target tracking result of the current frame image is obtained, updating the classifier to adapt the parameters of the Bayesian classifier to the change of the environment and the target.
In the embodiment of the invention, the position and the scale of the target in the current frame image can be accurately and real-timely determined through the steps. By iteratively executing the steps for each frame of image, the continuous tracking of the target can be realized, the continuous tracking requirement of a photoelectric search tracking system and a video monitoring system for common targets (such as targets of people, vehicles and the like) is met, and the continuous stable tracking can be realized only by giving the position and the size of the first frame of target. Compared with the prior art, the method of the embodiment of the invention improves the real-time performance and accuracy of target tracking and the robustness of the tracking algorithm to the shielding and the scale change, and solves the problems of poor tracking effect and weak tracking algorithm robustness caused by the shielding and the scale change in the existing compression tracking algorithm.
Example III
Fig. 4 is a schematic diagram of the main modules of the object tracking device in the third embodiment of the present invention. As shown in fig. 4, the object tracking device 400 according to the embodiment of the present invention includes: a determining module 401, a screening module 402, and a position calculating module 403.
A determining module 401, configured to determine candidate areas of each block of the target in the current frame image.
In an alternative example, the determining module 401 may determine candidate regions for each tile of the object in the current frame image based on the object tracking results in the previous frame image, such as the scale and location of the object in the previous frame image. For example, the determining module 401 may divide the current frame image into a plurality of blocks according to the sizes of the blocks in the previous frame image and the position coordinates of the blocks, and search each block and its neighboring areas by moving the search frame, where the determining module 401 may consider the search area selected by each search frame as a candidate area of the block.
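The moving-search-frame enumeration performed by the determining module 401 can be sketched as follows; the search radius, the step, and the (cx, cy, w, h) region representation are assumptions for illustration:

```python
def block_candidates(prev_center, size, radius=4, step=2):
    """Enumerate candidate regions for one block by sliding a search frame of
    the block's previous-frame size around its previous-frame center.
    Each selected search area becomes one candidate region (cx, cy, w, h)."""
    cx, cy = prev_center
    w, h = size
    return [(cx + dx, cy + dy, w, h)
            for dx in range(-radius, radius + 1, step)
            for dy in range(-radius, radius + 1, step)]
```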
The screening module 402 is configured to perform compressive sampling on the candidate region of the block, and input image features obtained by the compressive sampling into a trained classifier to obtain a class prediction score of the candidate region of the block; the screening module 402 is further configured to screen, according to the class prediction score of the candidate region of the block, a region where the block is located from the candidate regions of the block.
The screening module 402 may perform compressive sampling on each candidate region of each block of the target separately. In an alternative example, the compressive sampling of a candidate region by the screening module 402 includes: performing convolution operations at different positions of the candidate region over neighborhoods of different sizes, taking the convolution results as first image features (also called "high-dimensional convolution features"), and then applying a projection matrix to the first image features to obtain second image features, i.e., the image features obtained by compressive sampling. A second image feature essentially corresponds to a linear combination of the pixel-value sums of a plurality (e.g., 2, 3, or another number) of rectangular regions. That is, the second image feature, i.e., the image feature obtained by compressive sampling, can be obtained by linearly combining the pixel-value sums of a plurality of rectangular regions, and its formula can be expressed as:

y_i = Σ_{j=1}^{s} a_ij X_j (1)
wherein X_j is the pixel-value sum of the j-th rectangular region, j takes the values 1, 2, …, s, and s is the number of rectangular regions participating in the linear combination; y_i is the second image feature obtained by formula (1), which in a specific implementation can be computed with acceleration using an integral image; a_ij is a weight coefficient.
Haar features are one type of image feature based on formula (1); four commonly used Haar features are shown in fig. 3. A Haar feature is the difference between the pixel sum of the white area and the pixel sum of the black area in fig. 3. In an alternative example, the screening module 402 may extract a plurality of Haar features in each candidate region of the block and input them into a trained classifier to obtain the class prediction score of the candidate region. The trained classifier may be a trained Bayesian classifier, a decision-tree-based classifier, a neural-network-based classifier, or the like, and can predict whether a candidate region is the target or the background.
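A sketch of the integral-image acceleration and of a Haar-like feature expressed in the linear-combination form of formula (1); the (x, y, w, h) rectangle representation is an assumption for illustration:

```python
import numpy as np

def integral_image(img):
    """Cumulative sum over rows then columns; ii[y, x] is the pixel sum
    of the sub-image with corners (0, 0) and (x, y) inclusive."""
    return img.cumsum(0).cumsum(1)

def rect_sum(ii, x, y, w, h):
    """Pixel sum of rectangle [x, x+w) x [y, y+h) in O(1) via the integral image."""
    total = ii[y + h - 1, x + w - 1]
    if x > 0: total -= ii[y + h - 1, x - 1]
    if y > 0: total -= ii[y - 1, x + w - 1]
    if x > 0 and y > 0: total += ii[y - 1, x - 1]
    return total

def compressed_feature(ii, rects, weights):
    """One feature y_i = sum_j a_ij * X_j over the rectangles' pixel sums,
    matching the linear-combination form of formula (1). A Haar feature is
    the special case with weights +1 (white area) and -1 (black area)."""
    return sum(a * rect_sum(ii, *r) for a, r in zip(weights, rects))
```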
In an alternative embodiment, the class prediction score of a candidate region is specifically the prediction score that the class of the candidate region is the target. In this alternative embodiment, the screening module 402 may select the candidate region with the highest class prediction score from the candidate regions of a block, and take that candidate region as the region where the block is located.
In another alternative embodiment, the class prediction score of a candidate region is specifically the prediction score that the class of the candidate region is the background. In this alternative embodiment, the screening module 402 may select the candidate region with the lowest class prediction score from the candidate regions of a block, and take that candidate region as the region where the block is located.
The position calculating module 403 is configured to calculate, when it is determined that there is an occluded segmented region, position coordinates of a target in the current frame image according to position coordinates of regions where other segments except the occluded segmented region are located.
For example, assuming that the target is divided into 5 blocks, respectively, the blocks A, B, C, D, E, if the block B is blocked, the position calculation module 403 calculates the position coordinates of the target in the current frame image according to the position coordinates of the region where the block a is located, the position coordinates of the region where the block C is located, the position coordinates of the region where the block D is located, and the position coordinates of the region where the block E is located.
In the embodiment of the invention, the device combines the compressed sensing tracking algorithm and the block tracking algorithm well, and at the same time judges whether a blocked block area exists. When a blocked block area exists, the position coordinates of the target in the current frame image are calculated from the position coordinates of the areas where the blocks other than the blocked block area are located. This improves the real-time performance and accuracy of target tracking and the robustness of the tracking algorithm to blocking and scale change, and solves the problems of poor tracking effect and weak tracking algorithm robustness caused by blocking and scale change in the existing compression tracking algorithm.
Example IV
Fig. 5 is a schematic diagram of the main modules of the object tracking device in the fourth embodiment of the present invention. As shown in fig. 5, the object tracking device 500 according to the embodiment of the present invention includes: a determining module 501, a screening module 502, a position calculating module 503 and a scale calculating module 504.
A determining module 501 is configured to determine candidate areas of each block of the target in the current frame image.
With respect to the determination module 501, reference may be made to the exemplary illustration of the embodiment shown in fig. 4 for specific determination of candidate regions for respective tiles of the object in the current frame image.
The screening module 502 is configured to perform compressive sampling on the candidate areas of the blocks, and input image features obtained by the compressive sampling into a trained classifier to obtain a class prediction score of the candidate areas of the blocks; the screening module 502 is further configured to screen, according to the class prediction score of the candidate region of the block, a region where the block is located from the candidate regions of the block.
In an alternative example, the filtering module 502 performs compressive sampling on the candidate region of the block, and inputs the image features obtained by the compressive sampling into a trained classifier, so as to obtain a class prediction score of the candidate region of the block, where the class prediction score includes: the screening module 502 extracts a plurality of Haar features from each candidate region of the block, and takes the plurality of Haar features as image features obtained by compressed sampling; the screening module 502 inputs the plurality of Haar features into a trained first bayesian classifier to obtain a class prediction score for the candidate region.
Further, in the above alternative example, the category prediction score of the candidate region may be specifically a prediction score that the category of the candidate region is a target. After obtaining the class prediction scores of the candidate regions of a block, the screening module 502 may select the candidate region with the highest class prediction score from the candidate regions of the block, and take the candidate region with the highest class prediction score as the region where the block is located.
A position calculating module 503, configured to determine, when it is determined that there is an occluded segmented region, position coordinates of a target in a current frame image according to position coordinates of regions where other segments except the occluded segmented region are located; the position calculating module 503 is further configured to calculate, when it is determined that there is no blocked block area, a position coordinate of the target in the current frame image according to a position coordinate of an area where each block of the target is located.
For example, assuming that the target is divided into 5 blocks, namely, blocks A, B, C, D, E, if a block B is blocked, calculating the position coordinates of the target in the current frame image according to the position coordinates of the region where the block a is located, the position coordinates of the region where the block C is located, the position coordinates of the region where the block D is located, and the position coordinates of the region where the block E is located; if the blocked block does not exist, calculating the position coordinate of the target in the current frame image according to the position coordinate of the area where the block A is located, the position coordinate of the area where the block B is located, the position coordinate of the area where the block C is located, the position coordinate of the area where the block D is located and the position coordinate of the area where the block E is located.
The scale calculation module 504 is configured to determine a candidate region of the target in the current frame image according to the position coordinate of the target in the current frame image and the scale of the target in the previous frame image.
Illustratively, the determining, by the scale calculation module 504, of the candidate region of the target in the current frame image according to the position coordinates of the target in the current frame image and the scale of the target in the previous frame image includes: the scale calculation module 504 may construct a reference region s_0 by taking the position coordinates of the target in the current frame image as the center of the reference region and the scale of the target in the previous frame image as the scale of the reference region; the scale calculation module 504 then performs enlargement or reduction processing on the reference region s_0 to obtain a candidate region set S = {s_1, s_2, …, s_N} of the target. Further, if the scale of the reference region s_0 is represented by its side length L_0, the scale of any candidate region s_i can be expressed as L_i = a_i L_0, wherein a_i represents the scale change coefficient.
The scale calculation module 504 is further configured to perform compressive sampling and normalization processing on the candidate region of the target, and input the image features obtained by the processing into a trained classifier to obtain a class prediction score of the candidate region of the target.
The trained classifier used by the scale computation module 504 may be a trained bayesian classifier, a classifier based on a decision tree, or a classifier based on a neural network, etc., where the trained classifier predicts whether the candidate region is a target or a background.
In an alternative example, the trained classifier used by the scale calculation module 504 is a Bayesian classifier (which may be denoted as the second Bayesian classifier). In this alternative example, the scale calculation module 504 may extract a plurality of Haar features in each candidate region of the target, normalize the plurality of Haar features, and input the processed image features into the trained second Bayesian classifier to obtain the class prediction score of the candidate region.
Further, in the above alternative example, the normalization processing performed by the scale calculation module 504 on the plurality of Haar features may use the following formula:
wherein y_i represents the Haar feature of the i-th candidate region of the target, and y'_i represents the Haar feature of the i-th candidate region of the target after normalization.
Further, in the above optional example, the second Bayesian classifier may be a naive Bayesian classifier. Bayes' theorem is the mathematical basis of the naive Bayesian classifier, which assumes that each eigenvalue in the feature vector is independent of the other eigenvalues. A naive Bayesian classifier may be constructed using the extracted features of the target, wherein the response of the naive Bayesian classifier can be as shown in the above formula (2).
The scale calculation module 504 is further configured to screen, according to the class prediction score of the candidate region of the target, a region where the target is located from the candidate regions of the target, and take the scale of the region where the target is located as the scale of the target in the current frame image.
In an alternative embodiment, the class prediction score of the candidate region is specifically a prediction score of the target of the class of the candidate region. After obtaining the class prediction scores of the candidate regions of the target, the scale calculation module 504 may pick the candidate region with the highest class prediction score from the candidate regions, and take the scale of the candidate region with the highest class prediction score as the scale of the target in the current frame image.
In another alternative embodiment, the class prediction score of the candidate region is specifically a prediction score of the background for the class of the candidate region. After obtaining the class prediction scores of the candidate regions of the target, the scale calculation module 504 may pick the candidate region with the lowest class prediction score from the candidate regions, and take the scale of the candidate region with the lowest class prediction score as the scale of the target in the current frame image.
In the embodiment of the invention, the position and the scale of the target in the current frame image can be accurately and real-timely determined through the device. By iteratively calling each module in the device for each frame of image, the continuous tracking of the target can be realized, the continuous tracking requirement of a photoelectric searching and tracking system and a video monitoring system on common targets (such as targets of people, vehicles and the like) is met, and the continuous stable tracking can be realized only by giving the position and the size of the target of the first frame. Compared with the prior art, the device of the embodiment of the invention improves the real-time performance and accuracy of target tracking and the robustness of the tracking algorithm to the shielding and the scale change, and solves the problems of poor tracking effect and weak tracking algorithm robustness caused by the shielding and the scale change in the existing compression tracking algorithm.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.
Claims (7)
1. A method of target tracking, the method comprising:
determining candidate areas of each block of the target in the current frame image; dividing the current frame image into a plurality of blocks according to the sizes of the blocks in the previous frame image and the position coordinates of the blocks, and searching in each block and the adjacent areas thereof in a mode of moving a search frame, wherein a search area selected by the search frame each time is regarded as a candidate area of the block;
performing compressed sampling on the candidate areas of the blocks, and inputting image features obtained by the compressed sampling into a trained classifier to obtain category prediction scores of the candidate areas of the blocks; screening the area where the block is located from the candidate areas of the block according to the category prediction scores of the candidate areas of the block;
constructing a block characteristic vector used for clustering according to the offset of the position coordinate of the region where the block is located compared with the position coordinate of the region where the corresponding block is located in the previous frame of image and the category prediction score of the region where the block is located; clustering each block according to the block feature vector used by the clustering, and judging whether a blocked block area exists according to a clustering result; under the condition that the blocked blocking area exists, calculating the position coordinates of the target in the current frame image according to the position coordinates of areas where other blocking areas except the blocked blocking area are located;
Constructing a reference area according to the position coordinates of the target in the current frame image and the scale of the target in the previous frame image; performing enlargement and reduction processing on the reference area to determine a candidate area of the target in the current frame image; performing compression sampling and normalization processing on the candidate region of the target, and inputting the processed image features into a trained classifier to obtain a category prediction score of the candidate region of the target; according to the category prediction score of the candidate region of the target, the region where the target is located is screened out from the candidate region of the target, and the scale of the region where the target is located is used as the scale of the target in the current frame image; after the target tracking result of the current frame image is obtained, updating the classifier to adapt the parameters of the Bayesian classifier to the change of the environment and the target.
2. The method of claim 1, wherein the compressive sampling of the segmented candidate regions and inputting the compressive sampled image features into a trained classifier to obtain class prediction scores for the segmented candidate regions comprises:
Extracting a plurality of Haar features from each candidate region of the block, and taking the plurality of Haar features as image features obtained by compressive sampling; and inputting the plurality of Haar features into a trained first Bayesian classifier to obtain a category prediction score of the candidate region.
3. The method according to claim 2, wherein the response of the trained first bayesian classifier satisfies:
wherein H_i(y_i) is the class prediction score of the candidate region of the i-th block, i = 1, 2, …, N, and N is the total number of blocks; P(y_ij | k=1) denotes the predicted probability value that the feature y_ij is a target feature; P(y_ij | k=0) denotes the predicted probability value that the feature y_ij is a background feature; y_ij is the j-th Haar feature in the candidate region of the i-th block; L is the number of Haar features in the candidate region of each block; w_ij is the weight of y_ij, determined by the center coordinates of the Haar feature; (x_c, y_c) is the center coordinates of the entire target; β is a constant related to the diagonal of the target region.
4. The method according to claim 2, wherein the method further comprises:
and extracting some image feature sequences from the region which is close to the target in the previous frame of image to serve as positive samples, extracting some image feature sequences from the region which is far away from the target in the previous frame of image to serve as negative samples, and training the first Bayesian classifier according to the positive samples and the negative samples to obtain the trained first Bayesian classifier.
5. The method of claim 1, wherein the compressive sampling and normalizing the candidate regions of the object and inputting the processed image features into a trained classifier to obtain a class prediction score for the candidate regions of the object comprises:
extracting a plurality of Haar features from each candidate region of the target, carrying out normalization processing on the plurality of Haar features, and inputting the processed image features into a trained second Bayesian classifier to obtain a category prediction score of the candidate region.
6. The method according to claim 1, wherein the method further comprises:
and under the condition that the blocked partitioned area does not exist, calculating the position coordinates of the target in the current frame image according to the position coordinates of the area where each partitioned area of the target is located.
7. An object tracking device, the device comprising:
the determining module is used for determining candidate areas of each block of the target in the current frame image; dividing the current frame image into a plurality of blocks according to the sizes of the blocks in the previous frame image and the position coordinates of the blocks, and searching in each block and the adjacent areas thereof in a mode of moving a search frame, wherein a search area selected by the search frame each time is regarded as a candidate area of the block;
The screening module is used for carrying out compressed sampling on the candidate areas of the blocks, and inputting image features obtained by the compressed sampling into a trained classifier so as to obtain category prediction scores of the candidate areas of the blocks; screening the area where the block is located from the candidate areas of the block according to the category prediction scores of the candidate areas of the block;
constructing a block characteristic vector used for clustering according to the offset of the position coordinate of the region where the block is located compared with the position coordinate of the region where the corresponding block is located in the previous frame of image and the category prediction score of the region where the block is located; clustering each block according to the block feature vector used by the clustering, and judging whether a blocked block area exists according to a clustering result;
the position calculation module is used for determining the position coordinates of the target in the current frame image according to the position coordinates of the areas where the other blocks except the blocked block areas are located under the condition that the blocked block areas are judged to exist;
the apparatus further comprises:
the scale calculation module is used for constructing a reference area according to the position coordinates of the target in the current frame image and the scale of the target in the previous frame image; performing enlargement and reduction processing on the reference area to determine a candidate area of the target in the current frame image; the scale calculation module is further used for performing compression sampling and normalization processing on the candidate region of the target, and inputting the processed image features into a trained classifier to obtain a category prediction score of the candidate region of the target; the scale calculation module is further used for screening out the region where the target is located from the candidate regions of the target according to the category prediction scores of the candidate regions of the target, and taking the scale of the region where the target is located as the scale of the target in the current frame image; after the target tracking result of the current frame image is obtained, updating the classifier to adapt the parameters of the Bayesian classifier to the change of the environment and the target.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010588437.7A CN111860189B (en) | 2020-06-24 | 2020-06-24 | Target tracking method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010588437.7A CN111860189B (en) | 2020-06-24 | 2020-06-24 | Target tracking method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111860189A CN111860189A (en) | 2020-10-30 |
CN111860189B true CN111860189B (en) | 2024-01-19 |
Family
ID=72989813
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010588437.7A Active CN111860189B (en) | 2020-06-24 | 2020-06-24 | Target tracking method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111860189B (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105427339A (en) * | 2015-11-05 | 2016-03-23 | Tianjin Polytechnic University | Fast compressive tracking method combining feature screening and secondary localization |
CN105989611A (en) * | 2015-02-05 | 2016-10-05 | Nanjing University of Science and Technology | Occlusion-aware hash tracking method with shadow removal |
CN106097393A (en) * | 2016-06-17 | 2016-11-09 | Zhejiang University of Technology | Target tracking method based on multi-scale and adaptive updating |
CN106651912A (en) * | 2016-11-21 | 2017-05-10 | Guangdong University of Technology | Compressed sensing-based robust target tracking method |
CN106898015A (en) * | 2017-01-17 | 2017-06-27 | Huazhong University of Science and Technology | Multi-cue visual tracking method based on adaptive sub-block screening |
CN108062557A (en) * | 2017-11-21 | 2018-05-22 | Hangzhou Dianzi University | Scale-adaptive target tracking method based on the fast compressive tracking algorithm |
CN109389043A (en) * | 2018-09-10 | 2019-02-26 | Army Engineering University of PLA | Crowd density estimation method for unmanned aerial vehicle images |
CN110717934A (en) * | 2019-10-17 | 2020-01-21 | Hunan University | Anti-occlusion target tracking method based on STRCF |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103189898B (en) * | 2011-03-22 | 2016-01-20 | Panasonic Corporation | Moving body detection device and moving body detection method |
US9025024B2 (en) * | 2011-09-28 | 2015-05-05 | Xerox Corporation | System and method for object identification and tracking |
US10755428B2 (en) * | 2017-04-17 | 2020-08-25 | The United States Of America, As Represented By The Secretary Of The Navy | Apparatuses and methods for machine vision system including creation of a point cloud model and/or three dimensional model |
US20190244030A1 (en) * | 2018-02-07 | 2019-08-08 | Hitachi, Ltd. | Object tracking in video using better object area |
- 2020-06-24: CN application CN202010588437.7A granted as patent CN111860189B (status: Active)
Non-Patent Citations (9)
Title |
---|
Adaptive and compressive target tracking based on feature point matching; F. Li et al.; 2016 23rd International Conference on Pattern Recognition; 2734-2739 *
Event-based Robotic Grasping Detection with Neuromorphic Vision Sensor and Event-Stream Dataset; Bin Li et al.; arXiv:2004.13652; 1-14 *
Robust compressive tracking under occlusion; Z. Wu et al.; 2015 IEEE 5th International Conference on Consumer Electronics; 298-302 *
Target tracking for active protection *** detection radar; Li Huimin et al.; Modern Defence Technology; Vol. 42, No. 2; 116-121 *
Research on an improved target tracking algorithm based on compressive particle filtering; Jiang Xiaoli; China Master's Theses Full-text Database (Information Science and Technology), No. 10, 2016; I138-387 *
Sparse representation target tracking algorithm based on multi-scale adaptive weights; Cheng Zhongjian; Zhou Shuang'e; Li Kang; Computer Science, No. S1; 181-186 *
Visual tracking based on Bayesian methods; Yan Xiaowen; Xie Jieteng; Internet of Things Technologies, No. 04; 30-32+35 *
Research on target tracking methods under occlusion and scale change; Wang Zhenying; China Master's Theses Full-text Database (Information Science and Technology), No. 09, 2019; I138-825 *
Research on robust compressive tracking algorithms for moving video targets; Shao Chenyu; China Master's Theses Full-text Database (Information Science and Technology), No. 02, 2019; I138-1573 *
Also Published As
Publication number | Publication date |
---|---|
CN111860189A (en) | 2020-10-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108090456B (en) | Training method for recognizing lane line model, and lane line recognition method and device | |
CN111626128B (en) | Pedestrian detection method based on improved YOLOv3 in orchard environment | |
CN106886795B (en) | Object identification method based on salient object in image | |
Sathya et al. | PSO-based Tsallis thresholding selection procedure for image segmentation | |
WO2020098606A1 (en) | Node classification method, model training method, device, apparatus, and storage medium | |
CN112949572B (en) | Slim-YOLOv 3-based mask wearing condition detection method | |
CN107633226B (en) | Human body motion tracking feature processing method | |
CN109800692B (en) | Visual SLAM loop detection method based on pre-training convolutional neural network | |
CN110751027B (en) | Pedestrian re-identification method based on deep multi-instance learning | |
CN112802005A (en) | Automobile surface scratch detection method based on improved Mask RCNN | |
CN104978738A (en) | Method of detection of points of interest in digital image | |
CN114022812B (en) | DeepSort water surface floater multi-target tracking method based on lightweight SSD | |
CN115222727A (en) | Method for identifying target for preventing external damage of power transmission line | |
CN114627424A (en) | Gait recognition method and system based on visual angle transformation | |
CN113989721A (en) | Target detection method and training method and device of target detection model | |
CN114677602A (en) | Front-view sonar image target detection method and system based on YOLOv5 | |
CN116740652B (en) | Method and system for monitoring rust area expansion based on neural network model | |
CN111860189B (en) | Target tracking method and device | |
CN112597875A (en) | Multi-branch network anti-missing detection aerial photography target detection method | |
CN112288084A (en) | Deep learning target detection network compression method based on feature map channel importance degree | |
CN107067411B (en) | Mean-shift tracking method combined with dense features | |
CN115170838A (en) | Data screening method and device | |
Vijayarani et al. | An efficient algorithm for facial image classification | |
CN114882376B (en) | Convolutional neural network remote sensing image target detection method based on optimal anchor point scale | |
CN111008656B (en) | Target detection method based on prediction frame error multi-stage loop processing |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||