CN112396628B - Track jitter resistance method based on target height prediction - Google Patents
- Publication number
- CN112396628B CN112396628B CN201910746785.XA CN201910746785A CN112396628B CN 112396628 B CN112396628 B CN 112396628B CN 201910746785 A CN201910746785 A CN 201910746785A CN 112396628 B CN112396628 B CN 112396628B
- Authority
- CN
- China
- Prior art keywords
- track
- frame
- target
- height
- picture
- Prior art date
- Legal status (assumption, not a legal conclusion): Active
Classifications
- G—PHYSICS; G06—COMPUTING; CALCULATING OR COUNTING; G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
- G06T7/66—Analysis of geometric attributes of image moments or centre of gravity
- G06T2207/10016—Video; Image sequence
- G06T2207/30196—Human being; Person
- G06T2207/30241—Trajectory
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Multimedia (AREA)
- Geometry (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a track anti-jitter method based on target height prediction. After target detection, track data corrupted by occlusion are removed, and the relation between each remaining track point and the target height is fitted. A new bounding-box height is then computed from the fitted equation, and the position of the lower-boundary midpoint is estimated from the upper-boundary midpoint and the new height, forming a new track that is used for track analysis. Compared with traditional height-prediction algorithms, the method requires no camera calibration: the human height is regressed and the new track is computed from it, so the resulting track is strongly resistant to jitter, while the regression approach keeps computational complexity low and simplifies the calculation process.
Description
Technical Field
The invention relates mainly to the field of image processing, in particular to target track analysis, and more particularly to a track anti-jitter method based on target height prediction.
Background
Target tracking is one of the hot topics in computer vision research and is now widely applied in intelligent vehicles, video surveillance, human-computer interaction and other areas. Most target tracking methods perform real-time recognition and tracking of targets selected in the first frame of a short video image sequence. Mutual occlusion between people makes video analysis very difficult.
Currently popular target tracking algorithms include kernelized Structured Output Tracking (Struck), Tracking-Learning-Detection (TLD), Multiple Instance Learning (MIL), and Kernel Correlation Filter (KCF) tracking. KCF attracts the most attention owing to its excellent speed and accuracy, but the kernel correlation filter has an obvious weakness in handling occlusion. During tracking, large changes in the target itself or in the background, such as partial or complete occlusion of the target, changes in its appearance, or external illumination, strongly affect the tracking result. The complexity of these algorithms is also high, and reducing it through optimization remains an open problem.
Because targets are occluded while moving, especially when people are numerous or near the image boundary, the target track in a video sequence jitters severely. In new-retail video surveillance, store entry and exit are judged mainly from the humanoid track in the video; if a person enters or leaves while occluded, the sudden change of the humanoid bounding box makes the track jump, and this jitter makes the entry/exit count inaccurate. To address this problem, the invention detects targets with a deep learning model, tracks them by computing the intersection-over-union of target bounding boxes in adjacent frames to determine their relative positions in the video image, and fits the human height from the track data by regression.
Disclosure of Invention
The invention aims to provide a method that suppresses the track jitter that arises in target tracking when the detected target is occluded or near the video boundary.
The specific technical scheme of the invention is as follows:
A track anti-jitter method based on target height prediction, comprising the steps of:
(1) Training a deep learning network model, and performing target detection with the model;
(2) Establishing a plane rectangular coordinate system with X and Y axes for the detection target and defining the target bounding box, where w denotes the width of the bounding box and h its height;
(3) Running the model on each frame of the video to obtain the bounding-box center coordinates (x, y) and the bounding-box height h of each frame;
(4) Converting the center coordinates (x, y) of each frame into the upper-boundary midpoint coordinates (x1, y1) by a function formula, and connecting the upper-boundary midpoints (x1, y1) of successive frames into a track;
(5) Removing height outliers from the track of step (4), and fitting a functional relation between x1, y1 and h on the remaining values;
(6) Recomputing a new bounding-box height h1 for each frame of the track using the relation obtained in step (5);
(7) Computing the lower-boundary midpoint coordinates (x2, y2) of each frame from the new height h1 of step (6) and the functional relation between the upper-boundary midpoint coordinates (x1, y1) and the lower-boundary midpoint coordinates, and connecting the lower-boundary midpoints of successive frames into a new track;
(8) Taking the new track formed in step (7) as the target detection track.
Based on the above steps, the track anti-jitter method realizes multi-target detection by computing the intersection-over-union of the target bounding boxes and tracking the targets.
Further, the plane rectangular coordinate system of step (2) takes the upper-left corner of the detected picture as the coordinate origin, the length direction as the X axis, and the width direction as the Y axis.
Further, the function formula of step (4) follows directly from the geometry of the bounding box: x1 = x, y1 = y - h/2.
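For illustration, the conversion above can be sketched as a small helper (an illustrative sketch, not part of the original disclosure; Python and the function name are assumptions):

```python
def center_to_top_midpoint(x, y, h):
    """Convert a bounding-box center (x, y) and box height h to the
    midpoint of the box's upper boundary. The origin is the top-left
    corner of the picture and y grows downward, so the upper boundary
    sits half a box height above the center (i.e. at smaller y)."""
    return x, y - h / 2.0
```

For a center (320, 240) with height 100, this yields the upper-boundary midpoint (320, 190).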
Further, the functional relation of step (5) is a formula fitted from the data x1, y1 and h, specifically: h = a·y1 + b, where a and b are regression coefficients.
Further, the regression coefficients a and b are computed from the track data.
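The fit of h = a·y1 + b can be sketched with closed-form simple linear regression (an illustrative sketch assuming ordinary least squares; the patent does not name a specific fitting procedure):

```python
def fit_height_model(y1s, hs):
    """Least-squares fit of the height model h = a*y1 + b over one track.
    y1s: upper-boundary-midpoint y coordinates; hs: detected bounding-box
    heights. Returns the regression coefficients (a, b)."""
    n = len(y1s)
    mean_y = sum(y1s) / n
    mean_h = sum(hs) / n
    # Closed-form simple linear regression: a = cov(y1, h) / var(y1).
    sxy = sum((y - mean_y) * (h - mean_h) for y, h in zip(y1s, hs))
    sxx = sum((y - mean_y) ** 2 for y in y1s)
    a = sxy / sxx
    b = mean_h - a * mean_y
    return a, b
```

On exactly linear data, e.g. y1 values 100, 200, 300 with heights 60, 110, 160, the fit recovers a = 0.5 and b = 10.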
Further, the functional relation of step (7) follows directly from the geometry of the bounding box: x2 = x1, y2 = y1 + h1.
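Combining the regressed height with the upper-boundary midpoint gives the de-jittered lower-boundary midpoint of steps (6) and (7). This sketch is illustrative only; the function names are assumptions:

```python
def predict_bottom_midpoint(x1, y1, a, b):
    """Predict the lower-boundary midpoint from the upper-boundary
    midpoint (x1, y1): first the regressed height h1 = a*y1 + b,
    then y2 = y1 + h1 (y grows downward)."""
    h1 = a * y1 + b
    return x1, y1 + h1

def rebuild_track(top_midpoints, a, b):
    """Connect the predicted lower-boundary midpoints of successive
    frames into the new, jitter-resistant track."""
    return [predict_bottom_midpoint(x1, y1, a, b) for x1, y1 in top_midpoints]
```

With a = 0.5 and b = 10, an upper-boundary midpoint (320, 190) gets the predicted height 105 and the lower-boundary midpoint (320, 295).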
Further, the intersection-over-union is IoU = (A ∩ B)/(A ∪ B). For each bounding box of the previous frame, the IoU with every bounding box of the next frame is computed; the box with the largest IoU is found, and if that IoU exceeds the threshold, the two boxes are taken to belong to the same person.
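The IoU matching can be sketched as follows (an illustrative sketch with boxes as (x_min, y_min, x_max, y_max) tuples; the 0.3 threshold is an assumed example value, since the patent does not fix a number):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes given as
    (x_min, y_min, x_max, y_max)."""
    ix1 = max(box_a[0], box_b[0]); iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2]); iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def match_box(prev_box, next_boxes, threshold=0.3):
    """For one previous-frame box, find the next-frame box with the
    largest IoU; treat it as the same person only if that IoU exceeds
    the threshold. Returns the matched index, or None if no match."""
    best_i, best_iou = None, 0.0
    for i, nb in enumerate(next_boxes):
        v = iou(prev_box, nb)
        if v > best_iou:
            best_i, best_iou = i, v
    return best_i if best_iou > threshold else None
```

In the limiting case of identical boxes the IoU is 1; disjoint boxes give 0.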
Further, in the above steps, h denotes the height of the bounding box initially output by the model, and h1 denotes the predicted bounding-box height computed with the fitted function of step (5).
Beneficial effects: 1. Compared with traditional height-prediction algorithms, the method requires no camera calibration. 2. By regressing the human height and then computing a new track, the resulting track is strongly resistant to jitter. 3. The regression approach has low computational complexity and simplifies the calculation process.
Drawings
FIG. 1 shows the coordinate system and bounding box of the invention.
Fig. 2 is a schematic diagram of the intersection-over-union used in the method.
Fig. 3 shows the case where the detected target is occluded by an obstacle.
Fig. 4 is a flow chart of the method.
Wherein: 1. plane rectangular coordinate system; 2. bounding box; 201. bounding-box center; 202. upper-boundary midpoint; 203. lower-boundary midpoint; 3. obstacle; A. set one; B. set two; C. intersection.
Detailed Description
The method of the embodiments of the invention is described clearly and completely below with reference to the accompanying drawings.
As shown in Figs. 1 to 4, a track anti-jitter method based on target height prediction comprises the steps of:
(1) Training a deep learning network model, and performing target detection by using the model;
(2) A plane rectangular coordinate system 1 with X and Y axes is established for the detection target, and the target bounding box 2 is defined, where w denotes the width of bounding box 2 and h its height. The coordinate system 1 takes the upper-left corner of the detected picture as the origin O, the length direction as the X axis, and the width direction as the Y axis.
(3) Since the video arrives as a stream, the model reads and detects each frame in turn, obtaining for each frame the coordinates (x, y) of the bounding-box center 201, the height h and width w of bounding box 2, and the predicted target probability p; the larger p is, the more likely the detection is a true target.
(4) Using the formula obtained directly from the geometry of bounding box 2, x1 = x, y1 = y - h/2, the center coordinates (x, y) of point 201 in each frame are converted into the coordinates (x1, y1) of the upper-boundary midpoint 202, and the upper-boundary midpoints (x1, y1) of successive frames are connected into a track;
(5) Height outliers are removed from the track of step (4), and the remaining values are fitted to a functional relation between x1, y1 and h. Specifically, a fitting formula is obtained from the data relation of x1, y1 and h: h = a·y1 + b, where the regression coefficients a and b are computed from the track data.
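The patent removes height outliers before fitting but does not specify the outlier test; one common choice is the interquartile-range (IQR) rule, sketched here purely as an assumption:

```python
def drop_height_outliers(samples, k=1.5):
    """Remove height outliers before fitting h = a*y1 + b.
    samples: list of (x1, y1, h) tuples for one track. Keeps only
    samples whose height lies within [Q1 - k*IQR, Q3 + k*IQR];
    the IQR rule and factor k are assumed choices, not from the patent."""
    hs = sorted(h for _, _, h in samples)
    n = len(hs)
    q1 = hs[n // 4]
    q3 = hs[(3 * n) // 4]
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [s for s in samples if lo <= s[2] <= hi]
```

For a track whose detected heights hover around 100 except one occluded frame with height 10, the rule keeps the normal samples and drops the occluded one.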
(6) The new height h1 of bounding box 2 in each frame of the track is recomputed with the functional relation obtained in step (5).
(7) From the geometry of bounding box 2, the new height h1 of step (6) and the coordinates (x1, y1) of the upper-boundary midpoint 202 give the coordinates of the lower-boundary midpoint 203: x2 = x1, y2 = y1 + h1. The coordinates (x2, y2) of midpoint 203 are computed for each frame of the track; since every newly generated midpoint 203 is predicted from the upper-boundary midpoint 202, a new track can be computed for the video sequence.
(8) The new track formed in step (7) is taken as the target detection track.
Based on the above track anti-jitter method, multi-target detection is realized by computing the intersection-over-union of the target bounding boxes 2 and tracking the targets. Specifically, the intersection-over-union is IoU = (A ∩ B)/(A ∪ B): for each bounding box 2 of the previous frame, the IoU with every bounding box 2 of the next frame is computed, the box with the largest IoU is found, and if that IoU exceeds the threshold, the two boxes are taken to belong to the same person.
As shown in Fig. 2, A is set one, the region of a previous-frame bounding box 2, and B is set two, the region of a next-frame bounding box 2. The intersection C of A and B is their overlapping part; the larger C is, the greater the overlap between the two boxes, that is, the higher the probability that the previous and next frames cover the same detection target. In the limiting case, A and B coincide completely, the union of A and B has the same area as the intersection C, and the IoU equals 1, meaning the two boxes overlap completely.
As shown in Fig. 3, the model outputs the coordinates of the bounding-box center 201. When the detected target is at the video boundary or two targets occlude each other, bounding box 2 encloses only the exposed, unoccluded part of the human shape 204, while the part 205 hidden by obstacle 3 is not enclosed, so the complete human shape is not detected. The center 201 output by the model is then not the true center of the person; such coordinate data are the height outliers of step (5) above and cause track jitter during target tracking. When applying the method, these height outliers must be removed before the fitting of step (5); the fitting function is then formed from the remaining normal values, and for a target whose upper-boundary midpoint 202 data are corrupted by occlusion by obstacle 3, the predicted height and the coordinates of the lower-boundary midpoint 203 are computed from the fitted function. In this way, the track coordinates of the detected target are predicted more stably and accurately when the target is occluded or at the boundary.
The above embodiments are only preferred embodiments of the invention and do not limit its technical solutions; any technical solution that can be implemented on the basis of the above embodiments without inventive effort shall be considered to fall within the scope of protection of the claims of this patent.
Claims (4)
1. A method for resisting track jitter based on target height prediction, the method comprising the steps of:
(1) Training a deep learning network model, and carrying out target detection by using the model;
(2) Establishing a plane rectangular coordinate system with X and Y axes for the detection target and defining the target bounding box, wherein w represents the width of the target bounding box and h represents its height;
(3) Performing target detection on each frame of picture in the video by using the model to obtain the (x, y) coordinates of the boundary frame center point of each frame of picture and the height h of the boundary frame;
(4) Converting the obtained coordinates (x, y) of the center point of the bounding box of each frame into the coordinates (x1, y1) of the upper-boundary midpoint by a function formula, the upper-boundary midpoints (x1, y1) of successive frames being connected into a track, wherein the function formula is obtained directly from the geometric characteristics of the bounding box: x1 = x, y1 = y - h/2;
(5) Removing the height outliers from the track of step (4), and fitting the remaining values to a functional relation between x1, y1 and h, the relation being a formula fitted from the data relation of x1, y1 and h: h = a·y1 + b, where a and b are regression coefficients;
(6) Recalculating the new height h1 of the bounding box of each frame in the track by the functional relation obtained in step (5), and by the functional formula obtained directly from the geometric characteristics of the bounding box: x2 = x1, y2 = y1 + h1;
(7) Calculating the coordinates (x2, y2) of the lower-boundary midpoint of each frame in the track from the new height h1 of step (6) and the functional relation between the upper-boundary midpoint coordinates (x1, y1) and the lower-boundary midpoint coordinates, the lower-boundary midpoints of successive frames being connected into a new track;
(8) Taking the new track formed in the step (7) as a target detection track;
the track anti-jitter method based on the above steps realizes multi-target detection by computing the intersection-over-union of the target bounding boxes and tracking the targets.
2. The method of claim 1, wherein the rectangular plane coordinate system in the step (2) uses the upper left corner of the detected picture as the origin of coordinates, uses the length direction as the X-axis, and uses the width direction as the Y-axis.
3. The method for resisting track jitter based on target height prediction according to claim 1, wherein the regression coefficients a and b are computed from the track data.
4. The method of claim 1, wherein the specific formula of the intersection-over-union is IoU = (A ∩ B)/(A ∪ B), and the intersection-over-union of the bounding box of the previous frame with all bounding boxes of the next frame is calculated.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910746785.XA CN112396628B (en) | 2019-08-14 | 2019-08-14 | Track jitter resistance method based on target height prediction |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112396628A CN112396628A (en) | 2021-02-23 |
CN112396628B true CN112396628B (en) | 2024-05-17 |
Family
ID=74602632
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910746785.XA Active CN112396628B (en) | 2019-08-14 | 2019-08-14 | Track jitter resistance method based on target height prediction |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112396628B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114119384A (en) * | 2021-09-27 | 2022-03-01 | 青岛知能知造数据科技有限公司 | Method and system for removing jitter and calibrating edges of automobile longitudinal beam images |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109117794A (en) * | 2018-08-16 | 2019-01-01 | 广东工业大学 | A kind of moving target behavior tracking method, apparatus, equipment and readable storage medium storing program for executing |
WO2019006632A1 (en) * | 2017-07-04 | 2019-01-10 | 深圳大学 | Video multi-target tracking method and device |
Non-Patent Citations (1)
Title |
---|
Multi-moving-target tracking algorithm based on linear fitting; Li Tao; Huang Renjie; Li Dongmei; Zhao Xuezhuan; Jiao Pengwei; Journal of Southwest China Normal University (Natural Science Edition), No. 05, pp. 46-47, Section 1.3 *
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |