CN112069967A - Night-vision anti-halation pedestrian detection and tracking method based on heterogeneous video fusion - Google Patents


Info

Publication number: CN112069967A (application CN202010896881.5A; granted as CN112069967B)
Authority: CN (China)
Prior art keywords: detection, frame, frames, current frame, pedestrian
Legal status: Granted; Active
Other languages: Chinese (zh)
Inventors: 郭全民, 张文平, 田英侠, 柴改霞, 杨建华, 陈阳
Current / Original Assignee: Xian Technological University
Application filed by Xian Technological University; priority to CN202010896881.5A

Classifications

    • G06V40/103 — Recognition of biometric, human-related or animal-related patterns in image or video data; static body considered as a whole, e.g. static pedestrian or occupant recognition
    • G06F18/214 — Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F18/24 — Pattern recognition; classification techniques
    • G06V20/40 — Scenes; scene-specific elements in video content


Abstract

The invention discloses a night-vision anti-halation pedestrian detection and tracking method based on heterogeneous video fusion. The method adaptively selects detection frames according to an inter-frame difference threshold and a maximum frame-separation threshold. The inter-frame difference threshold is determined from the correlation between the cosine angle of the feature vectors of two frame images, the number of interval frames, and the visual effect of the detected sequence, so that the number of detection frames is reduced as far as possible while the visual characteristics of the human eye are still satisfied; this improves the processing efficiency of night-vision anti-halation pedestrian detection and avoids both the missed detections caused by too large a content difference between video frames and the redundant detections caused by too small a difference during intermittent cyclic detection-tracking. The maximum frame-separation threshold is introduced to avoid the stale template updates and tracking-loss errors that occur when the content difference between video frames stays too small for too long, improving the precision and fault tolerance of night-vision anti-halation pedestrian detection.

Description

Night-vision anti-halation pedestrian detection and tracking method based on heterogeneous video fusion
Technical Field
The invention belongs to the technical field of night-vision anti-halation, and relates in particular to an adaptive intermittent cyclic detection-tracking method for night-vision anti-halation — a night-vision anti-halation pedestrian detection and tracking method based on heterogeneous video fusion.
Background
The night-vision anti-halation technology of heterogeneous image fusion combines the halation-free character of infrared images with the rich color and detail information of visible images; it provides a new approach to the halation problem in night driving and has good application prospects.
The heterogeneous-fusion night-vision anti-halation method eliminates halation in the night-vision image through infrared and visible-light image fusion, restoring the color and detail information of the image and improving the imaging quality. It has developed from the early single-transform fusion (wavelets, color spaces, and the like) to composite fusion methods that combine color spaces with multi-resolution and multi-scale transforms, yielding fused images with more thorough halation removal and richer detail and color. Applied to night pedestrian detection, however, it still faces two challenges. First, although composite fusion is effective, its algorithmic complexity is high, so the pedestrian detection and tracking efficiency of the whole system is low and the speed hardly meets practical requirements. Second, although the fused image is free of halation and rich in detail and color in dark regions, insufficient light and poor imaging conditions at night leave its quality well below that of daytime images, causing missed pedestrians and low tracking precision.
Disclosure of Invention
Aiming at the low processing speed and low precision of existing heterogeneous-video-fusion night-vision anti-halation pedestrian detection and tracking methods, the invention designs a heterogeneous-video-fusion night-vision anti-halation pedestrian detection and tracking method that improves the speed and precision of pedestrian detection and tracking in night-vision halation dynamic scenes.
In order to achieve the purpose, the technical scheme of the invention is as follows:
a night vision anti-halation pedestrian detection and tracking method based on heterogeneous video fusion comprises the following steps:
step 1, self-adaptively selecting a detection frame, wherein the step specifically comprises the following steps:
step 1.1, extract a video frame, and take the 1st frame of the video sequence as the reference frame and the 2nd frame as the current frame;
step 1.2, calculate the number n of interval frames between the current frame and the reference frame according to the following formula:
n = n_c − n_r
where n_c is the current frame number and n_r is the reference frame number.
Step 1.3, calculate the maximum frame-separation threshold N according to the following formula:
N = f × T
where f is the frame rate and T is the maximum detection time threshold; T = 0.2 s (at f = 25 frames/s, for example, N = 25 × 0.2 = 5).
Step 1.4, if n ≥ N, the current frame C is a detection frame: set C as the new reference frame, set the frame after C as the new current frame, and jump back to step 1.2;
if n < N, continue to step 1.5;
step 1.5, calculate the cosine angle θ between the feature vectors of the current frame and the reference frame:
θ = arccos( (R · C) / (‖R‖ · ‖C‖) )
where the reference frame feature vector R = [r_0, r_1, ..., r_63] and the current frame feature vector C = [c_0, c_1, ..., c_63].
Step 1.6, compare θ with the set inter-frame difference threshold τ, the threshold τ taking a value in [1.5, 2.2]; only a current frame exceeding the threshold is detected:
if θ > τ, the current frame C is a detection frame: set C as the new reference frame, set the frame after C as the new current frame, and jump back to step 1.2;
if θ ≤ τ, the current frame C is not a detection frame: keep the reference frame unchanged, set the frame after C as the new current frame, and jump back to step 1.2.
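For illustration, the selection loop of step 1 can be written as the following minimal sketch. It is not the patent's implementation: the helper frame_feature stands for the 64-bin RGB histogram feature of step 1.5 (a sketch of it appears in the detailed description below), and the angle θ and threshold τ are assumed to be in degrees, since the text does not state the unit.

```python
import numpy as np

def cosine_angle(r, c):
    """Cosine angle between two frame feature vectors (degrees, assumed unit)."""
    cos = np.dot(r, c) / (np.linalg.norm(r) * np.linalg.norm(c))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

def select_detection_frames(frames, frame_feature, f=25, T=0.2, tau=1.8):
    """Adaptively pick detection-frame indices from a list of video frames."""
    N = int(f * T)                      # maximum frame-separation threshold (step 1.3)
    detected = [0]                      # the 1st frame serves as the first reference frame
    ref_idx, ref_vec = 0, frame_feature(frames[0])
    for cur_idx in range(1, len(frames)):
        n = cur_idx - ref_idx           # number of interval frames (step 1.2)
        cur_vec = frame_feature(frames[cur_idx])
        # detect when the interval is too long (step 1.4) or the content
        # difference exceeds the inter-frame difference threshold (step 1.6)
        if n >= N or cosine_angle(ref_vec, cur_vec) > tau:
            detected.append(cur_idx)
            ref_idx, ref_vec = cur_idx, cur_vec
    return detected
```

At 25 frames/s this reproduces the behavior described later in the embodiment: at most N = 5 frames may pass before a detection is forced, however small the content difference.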
Step 2, start pedestrian detection once the detection condition is met, with the following specific steps:
step 2.1, cut the detection frame into several equal-sized sub-images, obtain the initial detection result S_i of each sub-image with the pedestrian detector, and form the set S according to the following formula:
S = (S_0 ∪ S_1 ∪ … ∪ S_m) − ((S_0 ∩ S_1) ∪ (S_1 ∩ S_2) ∪ … ∪ (S_{m−1} ∩ S_m))
where m is the number of cuts and i = 0, 1, ..., m.
Step 2.2, compute the first-screened detection set S_I according to the following formula:
S_I = S − S_O
where S_O is the set of all detection boxes whose overlap with another detection box exceeds the threshold y_1; take y_1 = 0.7.
Step 2.3, further screen the detection boxes within the boundary range of the redundant area.
Step 2.3.1, screen the boxes in the redundant area according to the position coordinates of the detection boxes;
step 2.3.2, compute the redundant-box set S_R according to the following formula:
[The defining formula of S_R is reproduced only as an image in the original; it compares candidate boxes a and b against the rejection thresholds y_2 and y_3.]
where a and b are candidate boxes and y_2, y_3 are the rejection thresholds; take y_2 = 0.8, y_3 = 0.6.
Step 2.4, obtain the pedestrian detection-box set S_P of the detection frame according to the following formula:
S_P = S_I − S_R
Step 3, track the pedestrian after it is detected, with the following specific steps:
step 3.1, take the detection box in the detection frame as the sample image block, obtain the training samples x = {x_i | i = 1, 2, ..., n} by cyclic shifts, and input them into the classifier for training;
step 3.2, solve the DFT of the tracker template α according to the following formula:
α̂ = ŷ / (k̂^xx′ + λ)
where k̂^xx′ is the DFT of the kernel correlation whose elements are k(x, x′), ŷ is the DFT of the regression values y = {y_i | i = 1, 2, ..., n} corresponding to the training samples x = {x_i | i = 1, 2, ..., n}, and λ is a regularization parameter.
Step 3.3, compute the response according to the following formula and transform it from the frequency domain to the time domain; the region with the largest value is the position of the tracked pedestrian:
f̂(z) = k̂^xz ⊙ α̂
where k̂^xz is the DFT of the kernel correlation whose elements are k(x, z′), and the detection samples z = {z_i | i = 1, 2, ..., n}, z_i = P^i z, are constructed by cyclic dense sampling.
In step 1.6, τ preferably takes the value 1.8. This value lies in the middle of the stable region of NCIE, far from the inflection point where NCIE changes abruptly, and gives a high number of interval frames, so the number of detection frames is reduced as far as possible while the video sequence remains continuous — the lowest detection frame count for continuous video content — effectively reducing the detected frames while meeting the visual characteristics of the human eye.
The adaptive intermittent cyclic detection-tracking method for night-vision anti-halation designed by the invention solves the insufficient real-time performance caused by detecting every frame, by adopting an intermittent detection cycle; by selecting the detection frames adaptively it avoids the redundant or missed detections introduced by a fixed intermittent detection scheme, improving the detection and tracking precision for night-vision anti-halation pedestrians.
Compared with the prior art, the invention has the beneficial effects that:
1. Aiming at the low processing speed and low precision of pedestrian detection and tracking in night-halation dynamic scenes, the proposed adaptive intermittent cyclic detection-tracking method triggers the detection end of the detection-tracking working mode in intermittent cycles, greatly reducing the number of detection frames and greatly raising the detection speed;
2. The designed adaptive intermittent cyclic detection-tracking method selects detection frames adaptively according to the content difference of the video sequence, avoiding both the missed detections caused by too large a content difference between video frames and the redundant detections caused by too small a difference during intermittent cyclic detection-tracking, and effectively improves the detection precision.
Drawings
FIG. 1 is the adaptive intermittent cyclic detection-tracking flow diagram;
FIG. 2 is the 1st detection-frame image of the fused slow video sequence;
FIG. 3 is the 4th detection-frame image of the fused slow video sequence;
FIG. 4 is the 6th detection-frame image of the fused slow video sequence;
FIG. 5 is the 11th detection-frame image of the fused slow video sequence;
FIG. 6 is the 1st detection-frame image of the fused fast video sequence;
FIG. 7 is the 3rd detection-frame image of the fused fast video sequence;
FIG. 8 is the 4th detection-frame image of the fused fast video sequence;
FIG. 9 is the 7th detection-frame image of the fused fast video sequence;
FIG. 10 is the 1st frame image of the night-vision anti-halation video sequence detected by the synchronous detection-tracking method;
FIG. 11 is the 2nd frame image detected by the synchronous detection-tracking method;
FIG. 12 is the 3rd frame image detected by the synchronous detection-tracking method;
FIG. 13 is the 1st frame image detected by the intermittent cyclic detection-tracking method;
FIG. 14 is the 6th frame image detected by the intermittent cyclic detection-tracking method;
FIG. 15 is the 1st frame image detected by the adaptive intermittent cyclic detection-tracking method;
FIG. 16 is the 3rd frame image detected by the adaptive intermittent cyclic detection-tracking method;
FIG. 17 is the 4th frame image detected by the adaptive intermittent cyclic detection-tracking method;
FIG. 18 is the 7th frame image detected by the adaptive intermittent cyclic detection-tracking method.
Detailed Description
The present invention will be described in detail below with reference to the accompanying drawings and examples.
In order to solve the low processing speed and low precision of pedestrian detection and tracking in night-halation dynamic scenes, the invention designs an adaptive intermittent cyclic detection-tracking method suited to heterogeneous-fusion night-vision anti-halation pedestrian detection and tracking. The method adaptively selects detection frames according to an inter-frame difference threshold and a maximum frame-separation threshold. The inter-frame difference threshold is determined from the correlation between the cosine angle of the feature vectors of two frame images, the number of interval frames, and the visual effect of the detected sequence, so the number of detection frames is reduced as far as possible while the visual characteristics of the human eye are still satisfied; this improves the processing efficiency of night-vision anti-halation pedestrian detection and avoids both the missed detections caused by too large a content difference between video frames and the redundant detections caused by too small a difference during intermittent cyclic detection-tracking. The maximum frame-separation threshold avoids the stale template updates and tracking-loss errors that occur when the content difference between frames stays too small for too long, improving the precision and fault tolerance of night-vision anti-halation pedestrian detection.
A night vision anti-halation pedestrian detection and tracking method based on heterogeneous video fusion comprises the following steps:
step 1, adaptive selection of detection frames
step 1.1, extract a video frame, and take the 1st frame of the video sequence as the reference frame and the 2nd frame as the current frame;
step 1.2, calculate the number n of interval frames between the current frame and the reference frame according to the following formula:
n = n_c − n_r (1)
where n_c is the current frame number and n_r is the reference frame number.
Step 1.3, calculate the maximum frame-separation threshold N according to the following formula:
N = f × T (2)
where f is the frame rate and T is the maximum detection time threshold. When video content is observed, about 0.2 s elapses from the moment the light signal is transmitted from the eye to the brain until the visual residue (persistence of vision) disappears, so T = 0.2 s guarantees that the video observed by the human eye remains continuous.
Step 1.4, if n ≥ N, the current frame C is a detection frame: set C as the new reference frame, set the frame after C as the new current frame, and jump back to step 1.2; if n < N, continue to step 1.5;
step 1.5, calculate the cosine angle θ between the feature vectors of the current frame and the reference frame:
θ = arccos( (R · C) / (‖R‖ · ‖C‖) ) (3)
where the reference frame feature vector R = [r_0, r_1, ..., r_63] and the current frame feature vector C = [c_0, c_1, ..., c_63]. The feature vectors are constructed as follows (a code sketch follows step 1.6 below):
step 1.5.1, acquire the RGB histogram of the visible halation image;
step 1.5.2, construct the RGB histogram feature vector by mapping the three-dimensional RGB vector to a one-dimensional feature vector, with the following processing steps:
step 1.5.2.1, map each RGB pixel value to an integer in the range [0, 63] according to the following formula:
index_i = [B_i/64] × 4^2 + [G_i/64] × 4^1 + [R_i/64] × 4^0,  1 ≤ i ≤ N (4)
where index_i is the mapping value of the i-th pixel, R_i, G_i, B_i are the channel values of the i-th pixel, and N is the total number of pixels. ([B_i/64], [G_i/64], [R_i/64]) is a three-digit base-4 number (digits weighted by 4^2, 4^1, 4^0), each channel being quantized into the four partitions (0, 1, 2, 3).
Step 1.5.2.2, count the occurrences of each mapping value over the whole image; the counts of the 64 mapping values form the one-dimensional vector X = (Num_0, Num_1, ..., Num_63). This keeps the characteristics of each color channel of the whole image while avoiding the huge computation of using the RGB histogram directly;
step 1.6, compare θ with the set inter-frame difference threshold τ, and detect only a current frame exceeding the threshold.
If θ > τ, the current frame C is a detection frame: set C as the new reference frame, set the frame after C as the new current frame, and jump back to step 1.2; if θ ≤ τ, the current frame C is not a detection frame: keep the reference frame unchanged, set the frame after C as the new current frame, and jump back to step 1.2;
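A compact sketch of the feature construction of steps 1.5.1-1.5.2.2 follows (the image is assumed to be an H × W × 3 uint8 array with channels in R, G, B order; the channel layout is an assumption for illustration, not fixed by the text):

```python
import numpy as np

def frame_feature(img):
    """64-bin histogram feature X = (Num_0, ..., Num_63) per formula (4)."""
    q = (img // 64).astype(np.int64)  # quantize each channel into partitions 0..3
    # index_i = [B_i/64]*4^2 + [G_i/64]*4^1 + [R_i/64]*4^0   (formula (4))
    index = q[..., 2] * 16 + q[..., 1] * 4 + q[..., 0]
    return np.bincount(index.ravel(), minlength=64)  # Num_k = count of mapping value k
```

The resulting vector feeds the cosine-angle computation of formula (3).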
the threshold τ is determined as follows:
if the feature-vector cosine-angle threshold is set too large, the number of interval frames becomes too large and the observed video content turns discontinuous; if it is set too small, the number of interval frames is too small and redundancy remains between frames. To reduce the number of detection frames as much as possible while the video sequence stays continuous, the key is therefore to find the optimal balance point between the number of interval detection frames and the visual effect. The best value of the inter-frame difference threshold τ is determined by studying the relation between the feature-vector cosine-angle threshold, the number of interval detection frames, and the visual effect: the number of interval frames is measured by the frame-removal rate, and the visual effect after interval detection is judged subjectively from the visual characteristics of the human eye and objectively from the overall correlation index of the video sequence.
In terms of human visual characteristics, with the playing lengths of the video sequences before and after interval detection kept equal, if the human eye can hardly perceive a difference between the two sequences in actual playback, the video content is still continuous and the undetected frames in between are redundant; otherwise, the undetected frames contain valid frames.
As the objective index, the overall correlation of the video sequence after interval detection is measured by the nonlinear correlation information entropy (NCIE). For K video frames, the NCIE that quantifies their mutual nonlinear correlation is
NCIE = 1 − H_{R_N} (5)
where the nonlinear correlation matrix R_N is given by formula (6) and the nonlinear joint entropy H_{R_N} by formula (7):
R_N = {NCC_ij}, 1 ≤ i ≤ K, 1 ≤ j ≤ K (6)
H_{R_N} = − Σ_{i=1}^{K} (λ_i / K) · log_K (λ_i / K) (7)
where NCC_ij is the nonlinear correlation coefficient between the i-th and the j-th frame images, and λ_i (i = 1, ..., K) are the eigenvalues of the nonlinear correlation matrix.
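For concreteness, a small sketch of formulas (5)-(7), taking the K × K nonlinear correlation matrix R_N as given (how the coefficients NCC_ij are computed is not reproduced in this text):

```python
import numpy as np

def ncie(R_N):
    """Nonlinear correlation information entropy NCIE = 1 - H_{R_N} (formulas (5)-(7))."""
    K = R_N.shape[0]
    lam = np.linalg.eigvalsh(R_N)           # eigenvalues of the symmetric correlation matrix
    p = lam[lam > 1e-12] / K                # terms with lambda_i = 0 contribute nothing
    H = -np.sum(p * np.log(p) / np.log(K))  # nonlinear joint entropy, log base K (formula (7))
    return 1.0 - H
```

An NCIE close to that of the original sequence indicates that the interval-detected sequence is still perceived as continuous.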
By performing experiments on video sequences with different pedestrian speeds in different night-vision halation scenes, the number of interval frames is computed as the cosine-angle threshold is gradually increased; the overall correlation of the sequence after interval detection is judged by its NCIE value, and the inter-frame difference threshold τ is determined from the trend of the overall correlation.
According to the research results, the number of interval frames grows as the feature-vector cosine-angle threshold increases. Video sequences with slow pedestrian motion have the higher frame-removal rate: at τ = 2 the removal rate of different videos is 62%–76%. Sequences with fast object motion have a comparatively low removal rate: at τ = 2 it is 30%–38%.
The overall trend of NCIE is to decrease as τ increases. For τ ≤ 2 the NCIE value changes relatively smoothly and stays very close to the value before interval detection, and the human eye cannot perceive a difference between the two sequences in actual playback. For 2 < τ < 2.5 NCIE begins to drop sharply and differs markedly from its value before interval detection, and the human eye can perceive the difference between the two sequences. For τ ≥ 2.5 the NCIE value oscillates, but as a whole stays below its value for τ ≤ 2. This shows that an inflection point exists within 1.5 < τ ≤ 2.5 at which NCIE changes abruptly, the overall correlation of the video sequence weakens, and the video content begins to appear discontinuous.
In summary, the inter-frame difference threshold τ must keep NCIE in its stable region while giving a high frame-removal rate. The results show that NCIE changes abruptly beyond τ = 2.2 while the requirement of a high number of interval frames is still met, so the threshold τ is reasonably valued within [1.5, 2.2]. Preferably τ = 1.8: this value lies in the middle of the NCIE stable region, far from the inflection point of the abrupt NCIE change, and has a high number of interval frames, so the number of detection frames is reduced as far as possible while the video sequence remains continuous — the lowest detection frame count for continuous video content — effectively reducing the detected frames while meeting the visual characteristics of the human eye.
Step 2, start pedestrian detection once the detection condition is met, with the following specific steps:
step 2.1, cut the detection frame into several equal-sized sub-images, obtain the initial detection result S_i of each sub-image with the pedestrian detector, and form the set S according to the following formula:
S = (S_0 ∪ S_1 ∪ … ∪ S_m) − ((S_0 ∩ S_1) ∪ (S_1 ∩ S_2) ∪ … ∪ (S_{m−1} ∩ S_m)) (8)
where m is the number of cuts and i = 0, 1, ..., m.
Step 2.2, compute the first-screened detection set S_I according to the following formula:
S_I = S − S_O (9)
where S_O is the set of all detection boxes whose overlap with another detection box exceeds the threshold y_1. Experiments on actual video sequences show that with the overlap threshold y_1 = 0.7 few overlapping boxes remain and no detected pedestrian is lost.
Step 2.3, further screen the detection boxes within the boundary range of the redundant area.
Step 2.3.1, screen the boxes in the redundant area according to the position coordinates of the detection boxes;
step 2.3.2, compute the redundant-box set S_R according to formula (10):
[The defining formula (10) of S_R is reproduced only as an image in the original; it compares candidate boxes a and b against the rejection thresholds y_2 and y_3.]
where a and b are candidate boxes and y_2, y_3 are the rejection thresholds. Experiments on actual video sequences show that with y_2 = 0.8 and y_3 = 0.6 complete pedestrians are detected accurately and with high precision.
Step 2.4, obtain the pedestrian detection-box set S_P of the detection frame according to the following formula:
S_P = S_I − S_R (11)
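A hedged sketch of the screening of step 2 follows: the first screening of formula (9) is implemented with "intersection over the smaller box" as the overlap measure, and the redundant-box pass of formula (10), whose exact form appears only as an image, is only indicated. Both the overlap definition and the box representation are assumptions for illustration:

```python
def box_overlap(a, b):
    """Overlap of boxes a, b = (x1, y1, x2, y2): intersection area over the
    smaller box area (assumed measure)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    if iw * ih == 0.0:
        return 0.0
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return iw * ih / min(area(a), area(b))

def first_screening(boxes, y1=0.7):
    """Formula (9): S_I = S - S_O, dropping every box whose overlap with
    another box exceeds y1 (duplicates from overlapping sub-images)."""
    s_i = [a for i, a in enumerate(boxes)
           if not any(box_overlap(a, b) > y1
                      for j, b in enumerate(boxes) if j != i)]
    # The redundant-box set S_R of formula (10) (thresholds y2 = 0.8, y3 = 0.6)
    # would then be removed: S_P = S_I - S_R.
    return s_i
```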
Step 3, track the pedestrian after it is detected, with the following specific steps:
step 3.1, take the detection box in the detection frame as the sample image block, obtain the training samples x = {x_i | i = 1, 2, ..., n} by cyclic shifts, and input them into the classifier for training;
step 3.2, solve the DFT of the tracker template α according to the following formula:
α̂ = ŷ / (k̂^xx′ + λ) (12)
where k̂^xx′ is the DFT of the kernel correlation whose elements are k(x, x′), ŷ is the DFT of the regression values y = {y_i | i = 1, 2, ..., n} corresponding to the training samples x = {x_i | i = 1, 2, ..., n}, and λ is a regularization parameter.
Step 3.3, compute the response according to the following formula and transform it from the frequency domain to the time domain; the region with the largest value is the position of the tracked pedestrian:
f̂(z) = k̂^xz ⊙ α̂ (13)
where k̂^xz is the DFT of the kernel correlation whose elements are k(x, z′), and the detection samples z = {z_i | i = 1, 2, ..., n}, z_i = P^i z, are constructed by cyclic dense sampling.
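A minimal frequency-domain sketch of steps 3.2-3.3 for a single-channel template follows; a linear kernel is used for the correlation, and the kernel choice and normalization are assumptions, since the patent text does not reproduce them:

```python
import numpy as np

def train_template(x, y, lam=1e-4):
    """Step 3.2: DFT of the tracker template, alpha_hat = y_hat / (k_hat^xx' + lambda)."""
    X = np.fft.fft2(x)
    kxx = X * np.conj(X) / x.size           # linear-kernel autocorrelation in the DFT domain
    return np.fft.fft2(y) / (kxx + lam)

def locate(alpha_hat, x, z):
    """Step 3.3: response f_hat(z) = k_hat^xz * alpha_hat; the maximum of the
    time-domain response gives the tracked pedestrian position."""
    kxz = np.fft.fft2(z) * np.conj(np.fft.fft2(x)) / x.size
    response = np.real(np.fft.ifft2(kxz * alpha_hat))
    return np.unravel_index(np.argmax(response), response.shape)
```

Here x is the training image block of step 3.1, y the array of regression values, and z the detection sample patch in the next frame.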
A specific simulation example is given below to illustrate the present invention.
Example:
the environment of the embodiment: a visible-light camera (Basler acA1280-60gc) and a far-infrared camera (Gobi-640-GigE) simultaneously acquire visible and infrared video in a night halation scene at a resolution of 640 × 480, and the image data are transmitted to the image processing platform through a gigabit network port. The platform is a portable computer with an Intel(R) Core(TM) i7-7700HQ CPU @ 2.80 GHz, an NVIDIA GeForce GTX 1050 graphics card, and a 64-bit Windows 10 operating system. The processing software is MATLAB 2018 and Visual Studio 2017 combined with the OpenCV 3.4.1 library.
The main content is as follows: an adaptive intermittent cyclic detection-tracking method is designed (see FIG. 1) in which the detection end of the detection-tracking working mode is triggered by adaptive intermittent cycles, reducing the number of frames processed by the detection algorithm and thereby addressing the low processing speed and low precision of pedestrian detection and tracking in night-halation dynamic scenes. The specific steps are as follows:
One, adaptive selection of detection frames
1. Extract a video frame; take the 1st frame of the video sequence as the reference frame and the 2nd frame as the current frame.
2. Calculate the number n of frames between the reference frame and the current frame according to formula (1).
3. Calculate the maximum frame-separation threshold N according to formula (2).
4. If n ≥ N, the current frame C is a detection frame: set C as the new reference frame, set the frame after C as the new current frame, and jump back to step 2. If n < N, continue to step 5.
5. Calculate the cosine angle θ between the feature vectors of the current frame and the reference frame according to formula (3).
6. Determine the inter-frame difference threshold τ according to formulas (5), (6) and (7), so that the number of detection frames is reduced as much as possible while the video sequence remains continuous.
7. Compare θ with the set threshold τ; only a current frame exceeding the threshold is detected. If θ > τ, the current frame C is a detection frame: set C as the new reference frame, set the frame after C as the new current frame, and jump back to step 2. If θ ≤ τ, the current frame C is not a detection frame: keep the reference frame, set the frame after C as the new current frame, and jump back to step 2.
Two groups of video sequences in a night-vision anti-halation scene were selected at random for the experiment: Slow video is a sequence with slow pedestrian motion, Fast video one with fast motion. The original frame rate is 25 frames/s, so N = 5 according to formula (2); the playing lengths of Slow video and Fast video are 14.84 s and 15 s respectively. Following the frame-selection flow (see FIG. 1), the adaptive intermittent cyclic detection-tracking method reduces the number of detected frames of the two videos from 371 and 375 to 96 and 231 frames. The first four adaptively selected detection frames of the fused sequences are listed here; they correspond to original frames 1, 4, 6 and 11 of Slow video (see FIGS. 2-5) and original frames 1, 3, 4 and 7 of Fast video (see FIGS. 6-9). The experimental results of the intermittent cyclic detection-tracking method and of the adaptive intermittent cyclic detection-tracking method of the invention are shown in Tables 1 and 2.
Table 1: consecutive detection frames selected in Slow video
[Table 1 is reproduced as an image in the original publication.]
Table 2: consecutive detection frames selected in Fast video
[Table 2 is reproduced as an image in the original publication.]
As the results in Tables 1 and 2 show, in Slow video only four of the first 11 original frames — frames 1, 4, 6 and 11 — need to be detected by the adaptive intermittent cyclic method. Frames 1, 4 and 6 satisfy (n < N) and (θ ≥ τ), with adaptive intervals of 0.12 s and 0.08 s; between frames 6 and 11 the condition n ≥ N is reached, so frame 11 must be detected, with an adaptive interval of 0.2 s. Plain intermittent cyclic detection selects frames 1, 6 and 11, with a fixed interval of 0.2 s. In Fast video, of the first 7 original frames the adaptive method selects frames 1, 3, 4 and 7, all satisfying (n < N) and (θ ≥ τ), with adaptive detection intervals of 0.08 s, 0.04 s and 0.12 s; intermittent cyclic detection selects frames 1 and 6, with an interval of 0.2 s. The adaptive intermittent cyclic detection-tracking method thus improves real-time performance, avoids the missed detections of fixed intermittent cyclic detection, and raises both the speed and the precision of heterogeneous-video-fusion night-vision anti-halation pedestrian detection and tracking.
Two, video frame detection-tracking
1. Cut the detection frame into several equal-sized sub-images and compute the set S from the detection results of the sub-images according to formula (8).
2. Obtain the first-screened detection set S_I according to formula (9).
3. Calculate the redundant-box set S_R according to formula (10).
4. Obtain the pedestrian detection-box set S_P of the detection frame according to formula (11).
5. Solve the DFT α̂ of the tracker template according to formula (12).
6. Determine the position of the tracked pedestrian according to formula (13).
7. Track the next frame image.
A group of video sequences in a night-vision anti-halation scene was selected at random, with an original frame rate of 25 frames/s, 375 original frames and a playing length of 15 s, and the night-vision anti-halation video was processed with three methods: synchronous detection-tracking, intermittent cyclic detection-tracking, and adaptive intermittent cyclic detection-tracking. Synchronous detection-tracking detects all 375 frames, intermittent cyclic detection detects 62 frames, and adaptive intermittent cyclic detection detects 231 frames. The first consecutive detection frames selected by the three methods are listed here: the synchronous detection-tracking method detects frames 1, 2 and 3 of the night-vision anti-halation video sequence (see FIGS. 10-12), intermittent cyclic detection-tracking detects frames 1 and 6 (see FIGS. 13-14), and the adaptive intermittent cyclic detection-tracking method detects frames 1, 3, 4 and 7 (see FIGS. 15-18). Table 3 gives the accuracy and average detection time of the three detection-tracking methods.
Table 3: comparison of the results of the three methods on night-vision anti-halation video
[Table 3 is reproduced as an image in the original publication.]
As Table 3 shows, the detection speed of the intermittent cyclic detection-tracking method is more than 3 times that of synchronous detection-tracking, which removes the low processing speed and video stalling of the synchronous method; however, its fixed-interval detection also causes missed detections, so the pedestrian-detection accuracy drops somewhat. Compared with the intermittent cyclic method, the adaptive intermittent cyclic detection-tracking method improves the accuracy by 9.47%, while the detection speed falls from 50 FPS to 30 FPS — still meeting the real-time requirement of pedestrian detection and tracking. In overall performance, the adaptive intermittent cyclic detection-tracking method improves the night-vision anti-halation pedestrian detection and tracking accuracy while satisfying the real-time requirement; it outperforms both the synchronous and the intermittent cyclic detection-tracking methods and is better suited to pedestrian detection and tracking against a dynamic night-halation background.
The invention has been described through specific examples, which are intended to aid understanding rather than to limit it. A person skilled in the art may make several simple deductions, modifications or substitutions according to the idea of the invention.

Claims (2)

1. A night vision anti-halation pedestrian detection and tracking method based on heterogeneous video fusion comprises the following steps:
step 1, self-adaptively selecting a detection frame, wherein the step specifically comprises the following steps:
step 1.1, extract a video frame, and take the 1st frame of the video sequence as the reference frame and the 2nd frame as the current frame;
step 1.2, calculate the number n of interval frames between the current frame and the reference frame according to the following formula:
n = n_c − n_r
where n_c is the current frame number and n_r is the reference frame number;
step 1.3, calculate the maximum frame-separation threshold N according to the following formula:
N = f × T
where f is the frame rate and T is the maximum detection time threshold, T = 0.2 s;
step 1.4, if n ≥ N, the current frame C is a detection frame: set C as the new reference frame, set the frame after C as the new current frame, and jump back to step 1.2;
if n < N, continue to step 1.5;
step 1.5, calculate the cosine angle θ between the feature vectors of the current frame and the reference frame:
θ = arccos( (R · C) / (‖R‖ · ‖C‖) )
where the reference frame feature vector R = [r_0, r_1, ..., r_63] and the current frame feature vector C = [c_0, c_1, ..., c_63];
step 1.6, compare θ with the set inter-frame difference threshold τ, the threshold τ taking a value in [1.5, 2.2]; only a current frame exceeding the threshold is detected:
if θ > τ, the current frame C is a detection frame: set C as the new reference frame, set the frame after C as the new current frame, and jump back to step 1.2;
if θ ≤ τ, the current frame C is not a detection frame: keep the reference frame unchanged, set the frame after C as the new current frame, and jump back to step 1.2;
step 2, start pedestrian detection once the detection condition is met, with the following specific steps:
step 2.1, cut the detection frame into several equal-sized sub-images, obtain the initial detection result S_i of each sub-image with the pedestrian detector, and form the set S according to the following formula:
S = (S_0 ∪ S_1 ∪ … ∪ S_m) − ((S_0 ∩ S_1) ∪ (S_1 ∩ S_2) ∪ … ∪ (S_{m−1} ∩ S_m))
where m is the number of cuts and i = 0, 1, ..., m;
step 2.2, compute the first-screened detection set S_I according to the following formula:
S_I = S − S_O
where S_O is the set of all detection boxes whose overlap with another detection box exceeds the threshold y_1; take y_1 = 0.7;
step 2.3, further screen the detection boxes within the boundary range of the redundant area;
step 2.3.1, screen the boxes in the redundant area according to the position coordinates of the detection boxes;
step 2.3.2, compute the redundant-box set S_R according to the following formula:
[The defining formula of S_R is reproduced only as an image in the original; it compares candidate boxes a and b against the rejection thresholds y_2 and y_3.]
where a and b are candidate boxes and y_2, y_3 are the rejection thresholds; take y_2 = 0.8, y_3 = 0.6;
step 2.4, obtain the pedestrian detection-box set S_P of the detection frame according to the following formula:
S_P = S_I − S_R
and step 3, track the pedestrian after it is detected, with the following specific steps:
step 3.1, take the detection box in the detection frame as the sample image block, obtain the training samples x = {x_i | i = 1, 2, ..., n} by cyclic shifts, and input them into the classifier for training;
step 3.2, solve the DFT of the tracker template α according to the following formula:
α̂ = ŷ / (k̂^xx′ + λ)
where k̂^xx′ is the DFT of the kernel correlation whose elements are k(x, x′), ŷ is the DFT of the regression values y = {y_i | i = 1, 2, ..., n} corresponding to the training samples x = {x_i | i = 1, 2, ..., n}, and λ is a regularization parameter;
step 3.3, compute the response according to the following formula and transform it from the frequency domain to the time domain; the region with the largest value is the position of the tracked pedestrian:
f̂(z) = k̂^xz ⊙ α̂
where k̂^xz is the DFT of the kernel correlation whose elements are k(x, z′); the detection samples z = {z_i | i = 1, 2, ..., n}, z_i = P^i z, are constructed by cyclic dense sampling.
2. The night-vision anti-halation pedestrian detection and tracking method based on heterogeneous video fusion according to claim 1, characterized in that: in step 1.6, τ takes the value 1.8.
Application CN202010896881.5A, filed 2020-08-31 with priority date 2020-08-31 — Night-vision anti-halation pedestrian detection and tracking method based on heterogeneous video fusion — Active; granted as CN112069967B.

Priority Applications (1)

Application Number: CN202010896881.5A — Priority/Filing Date: 2020-08-31 — Title: Night-vision anti-halation pedestrian detection and tracking method based on heterogeneous video fusion

Publications (2)

Publication Number — Publication Date
CN112069967A — 2020-12-11
CN112069967B — 2022-12-06

Family ID: 73665359

Family Applications (1)
CN202010896881.5A (Active; granted as CN112069967B) — priority/filing date 2020-08-31 — Night-vision anti-halation pedestrian detection and tracking method based on heterogeneous video fusion

Country Status (1)
CN — CN112069967B (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party

US20080036576A1 * — priority 2006-05-31, published 2008-02-14 — Mobileye Technologies Ltd. — Fusion of far infrared and visible images in enhanced obstacle detection in automotive applications
US20090237511A1 * — priority 2008-03-18, published 2009-09-24 — BAE Systems Information and Electronic Systems Integration Inc. — Multi-window/multi-target tracking (MW/MT tracking) for point source objects
CN106023129A * — priority 2016-05-26, published 2016-10-12 — Xian Technological University — Infrared and visible light image fused automobile anti-halation video image processing method
CN109166149A * — priority 2018-08-13, published 2019-01-08 — Wuhan University — Positioning and three-dimensional wire-frame reconstruction method and system fusing a binocular camera and an IMU
CN109166131A * — priority 2018-09-29, published 2019-01-08 — Xian Technological University — Infrared and visible light fused vehicle night-vision anti-halation image segmentation and evaluation method
CN111339369A * — priority 2020-02-25, published 2020-06-26 — Foshan University — Video retrieval method, system, computer equipment and storage medium based on depth features

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party

KAI ZHANG et al.: "Infrared Weak-Small Target Image Fusion Based on Contrast and Wavelet Transform", AISS '19
柴改霞 et al.: "Evaluation method for automobile anti-halation images fused from visible and infrared light", Infrared Technology (红外技术)
郭全民 et al.: "Adaptive partition quality evaluation of night-vision anti-halation fused images", Journal of Electronics & Information Technology (电子与信息学报)
郭全民 et al.: "Automobile anti-halation system based on infrared and visible light image fusion", Infrared and Laser Engineering (红外与激光工程)

Also Published As

Publication number Publication date
CN112069967B (en) 2022-12-06


Legal Events

Date Code Title Description
PB01 — Publication
SE01 — Entry into force of request for substantive examination
GR01 — Patent grant