CN112396628A - Anti-track-jitter method based on target height prediction - Google Patents
- Publication number
- CN112396628A (application CN201910746785.XA)
- Authority
- CN
- China
- Prior art keywords
- track
- target
- frame
- height
- boundary
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06T7/246 — Image analysis; Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
- G06T7/66 — Image analysis; Analysis of geometric attributes of image moments or centre of gravity
- G06T2207/10016 — Image acquisition modality: Video; Image sequence
- G06T2207/30196 — Subject of image: Human being; Person
- G06T2207/30241 — Subject of image: Trajectory
Abstract
The invention discloses an anti-track-jitter method based on target height prediction. Under target detection, trajectory data corrupted by occlusion are removed; the relationship between trajectory position and height is fitted on the remaining trajectory data; the height of a new bounding box is calculated with the fitted equation; the midpoint of the lower edge of the bounding box is estimated from the upper-edge midpoint and the new height to form a new trajectory; and trajectory analysis is then performed on the new trajectory.
Description
Technical Field
The invention mainly relates to the field of image processing, in particular to target track analysis, and more particularly relates to a target height prediction-based anti-track-jitter method.
Background
Target tracking has long been one of the hot topics in computer vision research, and it is now widely applied to intelligent vehicles, video surveillance, human-computer interaction, and other areas. Most target tracking methods identify and track, in real time, a target selected in the first frame of a short video image sequence. Because targets frequently occlude one another, video analysis is very difficult.
Currently popular target tracking algorithms include the kernel-based Structured Output Tracking with Kernels (Struck) algorithm, the Tracking-Learning-Detection (TLD) algorithm, the Multiple Instance Learning (MIL) algorithm, and the Kernel Correlation Filter (KCF) tracking algorithm. The KCF tracking algorithm receives the most attention because of its excellent speed and accuracy, but it has a significant weakness in resisting occlusion. During tracking, large changes in the target or the background, such as partial or full occlusion of the target, changes in the target's appearance, or external illumination, greatly degrade the tracking result. These algorithms are also highly complex, and reducing that complexity remains a difficult problem.
Because a moving target is often occluded, especially in crowded scenes or when the target lies at the boundary of the video, the target's trajectory in the video sequence jitters severely. In a new-retail video surveillance scene, the person trajectory in the video is mainly used to judge whether a person enters or leaves the store. If the person is occluded, the bounding box changes abruptly and the trajectory suddenly jitters; when this happens as the person enters or leaves, the jittering trajectory makes the entry and exit counts inaccurate. To address this problem, the invention detects the target with a deep learning model, tracks it by computing the intersection ratio of target bounding boxes in adjacent frames, determines the relative position of the bounding box in the video image, and fits the person height from the trajectory data with a regression method.
Disclosure of Invention
The invention aims to provide a method that resolves the trajectory jitter that arises in target tracking when the detected target is occluded or lies at the boundary of the video.
The specific technical scheme of the invention is as follows:
An anti-track-jitter method based on target height prediction, the method comprising the steps of:
(1) training a deep learning network model, and detecting a target by using the model;
(2) establishing a planar rectangular coordinate system for the detected target, with an X axis and a Y axis, and establishing a target bounding box, wherein w represents the width of the target bounding box and h represents its height;
(3) performing target detection on each frame of the video by using the model to obtain the coordinates (x, y) of the center point of each frame's bounding box and the height h of the bounding box;
(4) converting the center-point coordinates (x, y) of each frame's bounding box into the coordinates (x1, y1) of the midpoint of the upper edge by using a function formula, and sequentially connecting the upper-edge midpoints (x1, y1) of all frames into a track;
(5) removing the height abnormal values from the track of step (4), and fitting the remaining values to obtain the functional relationship between x1, y1 and h;
(6) recalculating the new height h1 of each frame's bounding box in the track by using the functional relationship obtained in step (5);
(7) calculating the center-point coordinates (x2, y2) of the lower edge of each frame's bounding box in the track from the new height h1 of step (6) and the upper-edge midpoint coordinates (x1, y1), and sequentially connecting the lower-edge midpoints of all frames into a new track;
(8) taking the new track formed in step (7) as the target detection track;
and, based on the anti-track-jitter method of the above steps, performing target tracking by calculating the intersection ratio of target bounding boxes to realize multi-target detection.
Further, the rectangular plane coordinate system in step (2) takes the upper left corner of the detected picture as the origin of coordinates, the length direction as the X axis, and the width direction as the Y axis.
Further, the function formula in step (4) is obtained directly from the geometric characteristics of the bounding box; specifically: x1 = x, y1 = y - h/2.
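As a minimal Python sketch of this geometric conversion (the function name is illustrative; the patent gives only the formula), with the image origin at the upper-left corner and the Y axis pointing down, so the upper edge sits half a box height above the center:

```python
def center_to_top_midpoint(x, y, h):
    """Convert a bounding-box center (x, y) to the midpoint of its top edge.

    With the origin at the upper-left corner and Y increasing downward,
    the top edge lies h/2 above the center: x1 = x, y1 = y - h/2.
    """
    return x, y - h / 2.0
```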
Further, the functional relationship in step (5) is a fitted formula obtained from the data relation between x1, y1 and h; specifically: h = a·y1 + b, where a and b are regression coefficients.
Further, the regression coefficients a and b are calculated according to the trajectory data to obtain specific values.
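Calculating a and b from the trajectory data amounts to an ordinary least-squares line fit. A sketch in Python using NumPy follows; the patent does not prescribe a particular solver, so `np.polyfit` is an assumed choice, and the function name is illustrative:

```python
import numpy as np

def fit_height_model(y1_values, heights):
    """Least-squares fit of h = a*y1 + b from trajectory samples.

    y1_values: upper-edge midpoint y-coordinates along the trajectory
    heights:   corresponding bounding-box heights output by the detector
    Returns the regression coefficients (a, b).
    """
    a, b = np.polyfit(np.asarray(y1_values, dtype=float),
                      np.asarray(heights, dtype=float), deg=1)
    return a, b
```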
Further, the functional relationship in step (7) is obtained directly from the geometric characteristics of the bounding box; specifically: x2 = x1, y2 = y1 + h1.
Further, the specific formula of the intersection ratio is IoU = (A∩B)/(A∪B). For each bounding box of the previous frame, its intersection ratio with all bounding boxes of the next frame is calculated and the bounding box with the largest intersection ratio is found; if that intersection ratio is greater than the threshold, the two boxes are considered to belong to the same person.
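The intersection ratio IoU = (A∩B)/(A∪B) for axis-aligned boxes can be sketched as follows; boxes are assumed to be given as (x1, y1, x2, y2) corner coordinates, a representation the patent does not mandate:

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap extents; clamped at zero when the boxes are disjoint.
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0
```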
Further, in the above steps, h denotes the initial height of the bounding box output by the model, and h1 denotes the predicted height of the bounding box calculated with the fitting function of step (5).
Advantageous effects: 1. Compared with traditional algorithms, the method requires no camera calibration for height prediction. 2. By regressing the person height and then computing a new trajectory, the resulting trajectory is robust against jitter. 3. The regression approach has low computational complexity and simplifies the calculation process.
Drawings
FIG. 1 is a coordinate system and bounding box according to the present invention.
FIG. 2 is a schematic diagram of the cross-over ratio mentioned in the present method.
FIG. 3 shows the situation that the detected target is blocked by an obstacle according to the present invention.
FIG. 4 is a flow chart of the method.
Wherein: 1: planar rectangular coordinate system; 2: bounding box; 201: bounding-box center point; 202: upper-edge midpoint; 203: lower-edge midpoint; 3: obstacle; A: set one; B: set two; C: intersection.
Detailed Description
The method in the embodiments of the present invention will be described clearly and completely with reference to the accompanying drawings.
As shown in fig. 1 to 4, a method for resisting track jitter based on target prediction includes the following steps:
(1) training a deep learning network model, and detecting a target by using the model;
(2) A planar rectangular coordinate system 1 is established for the detected target, with an X axis and a Y axis, and a target bounding box 2 is established, where w denotes the width of the bounding box 2 and h denotes its height. The planar rectangular coordinate system 1 takes the upper-left corner of the detected picture as the coordinate origin O, the length direction as the X axis, and the width direction as the Y axis.
(3) Because the video arrives as a video stream, the model reads and performs target detection on each frame of the video, obtaining the coordinates (x, y) of the bounding-box center point 201 of each frame, the height h and width w of the bounding box 2, and the probability p of the predicted target; the larger p is, the more confident the identification of the target.
(4) Using a function formula obtained directly from the geometric characteristics of the bounding box 2, x1 = x, y1 = y - h/2, the center-point 201 coordinates (x, y) of each frame's bounding box are converted into the coordinates (x1, y1) of the upper-edge midpoint 202, and the upper-edge midpoints 202 of all frames are sequentially connected into a track.
(5) The height abnormal values are removed from the track of step (4), and the remaining values are fitted to obtain the functional relationship between x1, y1 and h. Specifically, a fitted formula is derived from the data relation between x1, y1 and h: h = a·y1 + b, where the regression coefficients a and b take specific values calculated from the trajectory data.
(6) Using the functional relationship obtained in step (5), the new height h1 of the bounding box 2 of each frame in the track is recalculated.
(7) From the geometric characteristics of the bounding box 2, the new height h1 of step (6) and the upper-edge midpoint 202 coordinates (x1, y1) give the lower-edge midpoint 203 of each frame: x2 = x1, y2 = y1 + h1. The coordinates (x2, y2) of the lower-edge midpoint 203 of each frame in the track are calculated accordingly. Because every newly generated lower-edge midpoint 203 is predicted from the upper-edge midpoint 202, a new trajectory can be calculated for the video sequence.
(8) The new track formed in step (7) is taken as the target detection track.
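Steps (4) through (7) above can be combined into a single pass over the per-frame detections. The following sketch is illustrative; variable and function names are not from the patent, and the regression coefficients a and b are assumed to have been fitted beforehand:

```python
def rebuild_trajectory(centers_heights, a, b):
    """Rebuild a jitter-resistant trajectory from per-frame detections.

    centers_heights: list of (x, y, h) tuples, the bounding-box center and
                     height detected in each frame
    a, b:            fitted regression coefficients of h = a*y1 + b

    For each frame: convert the center to the upper-edge midpoint (x1, y1),
    predict a corrected height h1 = a*y1 + b, and place the new lower-edge
    midpoint at (x1, y1 + h1).
    """
    track = []
    for x, y, h in centers_heights:
        x1, y1 = x, y - h / 2.0      # step (4): upper-edge midpoint
        h1 = a * y1 + b              # step (6): predicted height
        track.append((x1, y1 + h1))  # step (7): lower-edge midpoint
    return track
```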
Based on the anti-track-jitter method of the above steps, target tracking is performed by calculating the intersection ratio of the target bounding boxes 2 to realize multi-target detection. Specifically, the intersection ratio is IoU = (A∩B)/(A∪B): for each bounding box 2 of the previous frame, its intersection ratio with all bounding boxes 2 of the next frame is calculated and the bounding box with the largest intersection ratio is found; if that ratio is greater than the threshold, the two boxes are considered the same person.
As shown in fig. 2, A is set one, the region of a bounding box 2 in the previous frame; B is set two, the region of a bounding box 2 in the next frame; and the intersection C of A and B is their overlapping portion. A larger intersection C means more overlap between the two boxes, that is, a higher probability that the previous-frame box and the next-frame box are the same detection target. In the limiting case, set one A and set two B coincide completely: the area of their union equals the area of the intersection C, the intersection ratio is 1, and the two boxes coincide exactly.
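The frame-to-frame association described here, picking the next-frame box with the largest intersection ratio and accepting it only above a threshold, might be sketched as follows; the threshold value 0.3 is an assumed placeholder, since the patent does not give a number:

```python
def match_boxes(prev_boxes, next_boxes, threshold=0.3):
    """Greedy association of boxes across adjacent frames.

    For each previous-frame box, find the next-frame box with the largest
    IoU; the pair is considered the same person only if that IoU exceeds
    the threshold. Boxes are (x1, y1, x2, y2) corners; returns a list of
    (prev_index, next_index) matches.
    """
    def iou(a, b):
        iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
        ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
        inter = iw * ih
        union = ((a[2] - a[0]) * (a[3] - a[1])
                 + (b[2] - b[0]) * (b[3] - b[1]) - inter)
        return inter / union if union > 0 else 0.0

    matches = []
    for i, p in enumerate(prev_boxes):
        scores = [iou(p, n) for n in next_boxes]
        if scores:
            j = max(range(len(scores)), key=scores.__getitem__)
            if scores[j] > threshold:
                matches.append((i, j))
    return matches
```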
As shown in fig. 3, the model outputs the coordinates of the bounding-box center point 201. When a detected target is at the boundary of the video, or when two targets occlude each other, the bounding box 2 covers only the exposed, unoccluded part of the person 204, while the part 205 hidden by the obstacle 3 is not enclosed, so the complete person cannot be detected. The center point 201 output by the model is then not the center of the real person; such coordinate data are the height abnormal values of step (5), and they cause trajectory jitter in the data processing of target tracking. With the method of the invention, these abnormal heights are removed during the fitting of step (5); the fitting function formed from the remaining normal values then takes the coordinate data of the upper-edge midpoint 202, which is still available under occlusion by the obstacle 3, and calculates the predicted height and the predicted coordinates of the lower-edge midpoint 203. The method therefore predicts the trajectory coordinates of a detected target stably and accurately even when the target is occluded or at the boundary of the video.
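One possible way to identify and drop the height abnormal values before fitting is a k-sigma test on the detected heights. This criterion is an assumption; the patent does not specify how abnormal values are detected:

```python
import numpy as np

def remove_height_outliers(y1_values, heights, k=2.0):
    """Drop trajectory samples whose height deviates more than k standard
    deviations from the mean height (an assumed abnormality criterion).

    Returns the filtered (y1_values, heights) arrays, ready for fitting.
    """
    y1 = np.asarray(y1_values, dtype=float)
    h = np.asarray(heights, dtype=float)
    mean, std = h.mean(), h.std()
    keep = np.abs(h - mean) <= k * std if std > 0 else np.ones_like(h, bool)
    return y1[keep], h[keep]
```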
The above embodiments are only preferred embodiments of the present invention and are not intended to limit its technical solutions; any technical solution that can be realized on the basis of the above embodiments without creative effort shall be considered to fall within the protection scope of this patent.
Claims (7)
1. An anti-track-jitter method based on target height prediction, the method comprising the steps of:
(1) training a deep learning network model, and detecting a target by using the model;
(2) establishing a planar rectangular coordinate system for the detected target, with an X axis and a Y axis, and establishing a target bounding box, wherein w represents the width of the target bounding box and h represents its height;
(3) performing target detection on each frame of the video by using the model to obtain the coordinates (x, y) of the center point of each frame's bounding box and the height h of the bounding box;
(4) converting the center-point coordinates (x, y) of each frame's bounding box into the coordinates (x1, y1) of the midpoint of the upper edge by using a function formula, and sequentially connecting the upper-edge midpoints (x1, y1) of all frames into a track;
(5) removing the height abnormal values from the track of step (4), and fitting the remaining values to obtain the functional relationship between x1, y1 and h;
(6) recalculating the new height h1 of each frame's bounding box in the track by using the functional relationship obtained in step (5);
(7) calculating the center-point coordinates (x2, y2) of the lower edge of each frame's bounding box in the track from the new height h1 of step (6) and the upper-edge midpoint coordinates (x1, y1), and sequentially connecting the lower-edge midpoints of all frames into a new track;
(8) taking the new track formed in step (7) as the target detection track;
and, based on the anti-track-jitter method of the above steps, performing target tracking by calculating the intersection ratio of target bounding boxes to realize multi-target detection.
2. The anti-track-jitter method based on target height prediction according to claim 1, wherein the planar rectangular coordinate system in step (2) takes the upper-left corner of the detected picture as the origin of coordinates, the length direction as the X axis, and the width direction as the Y axis.
3. The anti-track-jitter method based on target height prediction according to claim 1, wherein the function formula in step (4) is obtained directly from the geometric characteristics of the bounding box; specifically: x1 = x, y1 = y - h/2.
4. The anti-track-jitter method based on target height prediction according to claim 1, wherein the functional relationship in step (5) is a fitted formula obtained from the data relation between x1, y1 and h; specifically: h = a·y1 + b, where a and b are regression coefficients.
5. The anti-track-jitter method based on target height prediction according to claim 4, wherein the regression coefficients a and b take specific values calculated from the trajectory data.
6. The anti-track-jitter method based on target height prediction according to claim 1, wherein the functional relationship in step (7) is obtained directly from the geometric characteristics of the bounding box; specifically: x2 = x1, y2 = y1 + h1.
7. The anti-track-jitter method based on target height prediction according to claim 1, wherein the intersection ratio is expressed by IoU = (A∩B)/(A∪B); for each bounding box of the previous frame, its intersection ratio with all bounding boxes of the next frame is calculated.
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
| --- | --- | --- | --- |
| CN201910746785.XA (CN112396628B) | 2019-08-14 | 2019-08-14 | Track jitter resistance method based on target height prediction |
Publications (2)

| Publication Number | Publication Date |
| --- | --- |
| CN112396628A | 2021-02-23 |
| CN112396628B | 2024-05-17 |
Family
- ID: 74602632

Family Applications (1)

| Application Number | Status | Priority Date | Filing Date |
| --- | --- | --- | --- |
| CN201910746785.XA (CN112396628B) | Active | 2019-08-14 | 2019-08-14 |

Country Status (1)

| Country | Link |
| --- | --- |
| CN | CN112396628B (en) |
Cited By (1)

| Publication number | Priority date | Publication date | Assignee | Title |
| --- | --- | --- | --- | --- |
| CN114119384A | 2021-09-27 | 2022-03-01 | 青岛知能知造数据科技有限公司 | Method and system for removing jitter and calibrating edges of automobile longitudinal beam images |
Citations (2)

| Publication number | Priority date | Publication date | Assignee | Title |
| --- | --- | --- | --- | --- |
| WO2019006632A1 | 2017-07-04 | 2019-01-10 | Shenzhen University (深圳大学) | Video multi-target tracking method and device |
| CN109117794A | 2018-08-16 | 2019-01-01 | Guangdong University of Technology (广东工业大学) | Moving target behavior tracking method, apparatus, device, and readable storage medium |
Non-Patent Citations (1)
- Li Tao; Huang Renjie; Li Dongmei; Zhao Xuezhuan; Jiao Pengwei: "Multi-moving-target tracking algorithm based on linear fitting" (基于线性拟合的多运动目标跟踪算法), Journal of Southwest China Normal University (Natural Science Edition), no. 05, pages 46-47.
Also Published As

| Publication number | Publication date |
| --- | --- |
| CN112396628B | 2024-05-17 |
Legal Events

| Code | Title |
| --- | --- |
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |
| GR01 | Patent grant |