CN117953546B - Pedestrian retrograde judgment method based on multi-target tracking


Info

Publication number: CN117953546B
Application number: CN202410354175.6A
Authority: CN (China)
Prior art keywords: track, target detection, frame, pedestrian, target
Legal status: Active (granted)
Other versions: CN117953546A (Chinese, zh)
Inventors: 张鹏, 董克, 李志超, 李末, 王泽灏, 赵威, 李爱华, 肖景洋, 李刚, 吴敏思
Current and original assignee: Shenyang Elysan Electronic Technology Co ltd
Application filed by Shenyang Elysan Electronic Technology Co ltd on 2024-03-27 (priority date 2024-03-27); published as CN117953546A on 2024-04-30; granted as CN117953546B on 2024-06-11.

Classifications

    • G06V40/103: Static body considered as a whole, e.g. static pedestrian or occupant recognition
    • G06N3/0464: Convolutional networks [CNN, ConvNet]
    • G06T7/62: Analysis of geometric attributes of area, perimeter, diameter or volume
    • G06T7/73: Determining position or orientation of objects or cameras using feature-based methods
    • G06V10/751: Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
    • G06V10/82: Image or video recognition or understanding using neural networks
    • G06V2201/07: Target detection (indexing scheme)
    • Y02T10/40: Engine management systems (climate change mitigation technologies related to transportation)


Abstract

A pedestrian retrograde judgment method based on multi-target tracking belongs to the technical field of behavior detection and comprises the following steps: step 1, constructing a target detection data set; step 2, constructing an improved YOLOv7 target detection model; step 3, training the improved YOLOv7 target detection model; step 4, acquiring image frames F_t (t = 1, 2, 3, ...) from a monitoring video stream, sending each acquired image frame into the target detection model trained in step 3, and performing target detection on the input frame to obtain a pedestrian target detection result; step 5, tracking pedestrian targets; and step 6, judging retrograde behavior. The invention adds an attention mechanism to the YOLOv7 target detection model, improving the model's ability to distinguish and localize target objects.

Description

Pedestrian retrograde judgment method based on multi-target tracking
Technical Field
The invention relates to the technical field of behavior detection, in particular to a pedestrian retrograde judgment method based on multi-target tracking.
Background
A subway station is an important component of an urban rail transit system. It contains many interconnected passageways that provide smooth, convenient and efficient transfer service, shortening journey times and reducing transfer costs. To ensure throughput, subway stations are usually equipped with unidirectional passages, escalators and the like, in which passengers must travel in a specified direction. When traffic is low, passengers can move freely in a passage and keep a relatively wide spacing. At peak hours or in crowds, however, maintaining order in unidirectional passages is of paramount importance, and operators often have to post dedicated staff to prevent the congestion, or even serious collisions, caused by passengers walking against the flow. In recent years subway stations have generally been fitted with camera surveillance; applying intelligent video analysis for pedestrian retrograde detection to the real-time pictures of unidirectional areas can improve safety, optimize service, improve operational scheduling and provide effective emergency response. These measures help raise a station's operating efficiency, passenger experience and overall level of safety.
Chinese patent application publication No. CN115188072A discloses a method and system for detecting pedestrians moving against the direction of an escalator. The method has the following characteristics: first, the camera is installed ahead of the escalator's running direction and matched to the escalator's installation inclination; second, the escalator region must be marked as an ROI in advance; third, any pedestrian recognized by the YOLO detector is treated as a moving pedestrian, the picture resolution is 416×416, and the detection accuracy is relatively low; fourth, an optical-flow method analyzes the tracking video of each pedestrian to generate a color-value map, in which the lower 1/4 region of the frame represents the movement direction of the escalator and the remaining 3/4 represents the movement direction of the pedestrian. This way of demarcating zones introduces a certain amount of noise and is not accurate enough; optical flow is sensitive to brightness and occlusion, and to regions that lack texture or contain repeated texture. If an image region has no obvious texture variation, the motion of its pixels is hard to compute accurately, and repeated texture can likewise make the result ambiguous.
Chinese patent application publication No. CN111860282A discloses a method and system for subway-section passenger flow statistics and pedestrian retrograde detection, characterized as follows: first, DeepSORT is used for multi-target tracking, which performs poorly under occlusion and interference, easily misjudges or loses targets, and is sensitive to changes in target appearance such as posture or orientation. Second, trip wires must be placed while avoiding regions that tend to be densely crowded, and retrograde motion is judged from pedestrian targets crossing the wire. This method can only make retrograde judgments at the trip-wire positions, has a narrow detection range, and cannot effectively filter out misjudgments caused by lingering and loitering behavior.
Chinese patent application publication No. CN110532852A discloses a deep-learning-based method for detecting abnormal pedestrian events in subway stations. It is characterized as follows: first, a YOLO detector and DeepSORT are used for target detection and tracking, which perform poorly under occlusion; second, it needs the first 15 frames of data, and the rule must be satisfied more than twice cumulatively before retrograde motion is declared, so real-time detection cannot be achieved.
Chinese patent application publication No. CN113361351A discloses a retrograde judgment method and system based on image recognition, which recognizes face orientation in single frames. Because consecutive frames are not associated, situations such as adjusting position on an escalator or turning to talk can cause a certain degree of misjudgment.
In summary, existing pedestrian retrograde detection methods still have shortcomings in practical application, mainly in the following respects:
1. Occlusion: when a pedestrian is occluded by other objects or pedestrians, traditional methods based on feature extraction and classification are prone to detection errors and cannot accurately infer the pedestrian's direction.
2. Complex background interference: in complex environments, the accuracy of retrograde detection may be affected by the background. For example, blurred differences in color or texture between pedestrians and the background can make detection more difficult.
3. Posture change: existing methods are sensitive to changes in pedestrian posture, and pedestrians take different postures and actions while walking, which is a challenge for direction-detection algorithms.
Disclosure of Invention
In view of the above shortcomings of the prior art, the present invention provides a pedestrian retrograde judgment method based on multi-target tracking, which adds an attention mechanism to the YOLOv7 target detection model and improves the model's ability to distinguish and localize target objects.
In order to achieve the above purpose, the main technical scheme adopted by the invention is as follows:
a pedestrian retrograde judgment method based on multi-target tracking comprises the following steps:
step 1, constructing a target detection data set
Collecting open-source person datasets from the web together with pedestrian image data from the monitored scene, using LabelImg to mark rectangular boxes around head-and-shoulder targets, and dividing the images whose annotations meet the standard into a training set and a validation set;
step 2, constructing an improved YOLOv7 target detection model
The improved YOLOv7 target detection model is based on the YOLOv7 network structure; a convolution layer with an attention mechanism module CBAM is added between layer 50 and layer 51, consisting of a 1×1 standard convolution followed by a CBAM, and a CBAM is also added inside the SPPCSPC layer at layer 51 of the YOLOv7 network structure;
step 3, training the improved YOLOv7 target detection model
Training the target detection model constructed in step 2 on the training set obtained in step 1 to obtain a trained target detection model;
Step 4, acquiring image frames F_t (t = 1, 2, 3, ...) from a monitoring video stream, sending each acquired frame into the target detection model trained in step 3, and performing target detection on the input frame to obtain a pedestrian target detection result, which comprises the pedestrian's position information marked by a target detection box and a target detection category, the categories comprising the front and the back of a pedestrian;
Step 5, pedestrian target tracking
Carrying out multi-target tracking on the pedestrian target detection result obtained in the step 4 by adopting an OC-SORT algorithm;
the step 5 specifically comprises the following steps:
Step 5.1, obtaining the tracker predictions
Let T = {T_1, T_2, ..., T_n} be the set of center-point tracks of the target detection boxes obtained from pedestrian detection; for each track, a Kalman filter computes its predicted box B̂_t in the current image frame from the track's observed positions in previous frames;
Step 5.2, extracting each track's velocity vector V_t from the track set T and associating the target detection boxes D_t with the predicted boxes B̂_t;
Let the center points of the two most recent consecutive frame positions be p_{t-1} = (u_{t-1}, v_{t-1}) and p_t = (u_t, v_t); then V_t = (p_t - p_{t-1}) / ‖p_t - p_{t-1}‖.
The association cost is C = C_HIoU + λ·C_v. C_HIoU is computed from the IoU value and the height intersection ratio between a predicted box and a target detection box, HIoU = [S_I / (S_p + S_d - S_I)] · [h_I / (h_p + h_d - h_I)], where S_p is the area of the predicted box, S_d the area of the target detection box, S_I their intersection area, h_p the height of the predicted box, h_d the height of the target detection box, and h_I the overlap height of the predicted box and the target detection box; only the results whose HIoU exceeds a preset threshold are retained. C_v is the angular difference between the track's velocity vector and the velocity vector formed by the track's previously observed position and the newly detected position, and λ is a weight factor. According to the association cost C, linear assignment is performed with the Hungarian algorithm to obtain the optimal matching between predicted boxes and target detection boxes, returning three kinds of results: T_matched (matched tracks), T_unmatched (unmatched tracks), and D_unmatched (unmatched target detection boxes);
Step 5.3, if unmatched target detection boxes D_unmatched and unmatched tracks T_unmatched remain, performing a second round of association: for the unmatched tracks, computing the association cost between the remaining target detection boxes and each track's last observed position, retaining the results exceeding the threshold, performing linear assignment with the Hungarian algorithm to obtain the optimal matching, and updating T_unmatched and D_unmatched;
Step 5.4, track update
For each successfully matched detection result and track, the detection result is used to update the track's velocity vector and the tracker's observed position; the miss count of each unmatched track in T_unmatched is incremented by 1;
Step 5.5, creating new tracks
For each unmatched detection result in D_unmatched, a new track and tracker are created, and the detection is initialized as the first observed position of the new track.
Step 6, retrograde behavior determination
Traversing all tracks in T, selecting the tracks whose recorded observation length exceeds 3, extracting each selected track's initial observation position box_0 and its observation position box_t at time t, computing the displacement variation of the target within the track from the extracted observations, and judging whether the track exhibits retrograde behavior by combining the target detection category and the velocity-vector result.
Further, in step 2, the convolution layer added between layer 50 and layer 51 has 1024 output channels, and the CBAM added between layer 50 and layer 51 has a channel-reduction coefficient reduction = 16 and a convolution kernel size k = 49.
Further, in step 2, the attention mechanism module CBAM is composed of a channel attention module and a spatial attention module, where the channel attention module comprises a fully connected layer and two ReLU activation layers, and the spatial attention module comprises a convolution layer and a sigmoid activation function.
Further, in step 3, the training method is the YOLOv7 model training procedure, with training parameters: 400 iterations, image size 640, batch size 32, and learning rate 0.01.
Further, in step 5, the OC-SORT algorithm uses a Kalman-filter-based multi-target tracker to predict and update the state of each target track, and performs matching through HIoU together with track-direction consistency.
Further, if the track set T is empty at the current moment, step 5.5 is executed.
Further, a step 5.6 is included: if a track's miss count reaches 30, the track is deleted.
Further, the method in step 6 for judging whether a track exhibits retrograde behavior is: the displacement variation is greater than 0.25, the count ratio of the front category in the track record is greater than 0.7, the target detection category at the last observation position is front, and the vertical component of the track's velocity vector V_t is positive, indicating motion toward the camera; a track satisfying all of the above conditions is judged to be retrograde.
The beneficial effects of the invention are as follows:
1. The invention adopts an improved YOLOv7 target detection model whose added attention mechanism lets the detector extract richer image features, improving the model's ability to distinguish and localize target objects.
2. The method adopts the OC-SORT algorithm for multi-target tracking, improving the tracking of targets in crowded and occluded scenes.
3. Based on the tracking result, a retrograde judgment can be made from as few as 3 frames, meeting real-time requirements; the judgment combines human behavioral characteristics, making the result more accurate.
4. The method is highly portable and runs well in a new scene after only a small amount of tuning.
Detailed Description
The present invention is described in detail below with reference to specific embodiments in order to better explain it.
The invention provides a pedestrian retrograde judgment method based on multi-target tracking, applied in settings where pedestrians must be checked for retrograde behavior, for example a unidirectional subway passage, where the behavior of a pedestrian facing the camera and approaching it is judged as retrograde.
The method comprises the following steps:
step 1, constructing a target detection data set
Collecting open-source person datasets from the web and pedestrian image data from the monitored scene, 5500 images in total; using the image annotation tool LabelImg to mark rectangular boxes around head-and-shoulder targets; and dividing the images whose annotations meet the standard into a training set and a validation set.
Specifically, the rectangular-box labels fall into three categories: front, back and others, where front requires the complete facial features to be visible, back is a complete back view, and all remaining angles are labeled others.
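A short sketch may make the step-1 data preparation concrete. The directory layout, file extension and the 9:1 split ratio below are assumptions for illustration; the patent only states that the labeled images are divided into a training set and a validation set.

```python
# Minimal sketch of the step-1 train/validation split (assumed 9:1 ratio).
import random
from pathlib import Path

images = sorted(Path("dataset/images").glob("*.jpg"))  # placeholder path
random.seed(0)                                         # reproducible split
random.shuffle(images)
cut = int(0.9 * len(images))
train_set, val_set = images[:cut], images[cut:]
print(f"{len(train_set)} training images, {len(val_set)} validation images")
```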
Step 2, constructing an improved YOLOv7 target detection model
The improved YOLOv7 target detection model is based on the YOLOv7 network structure. Between layer 50 and layer 51, a convolution layer with an attention mechanism module CBAM is added, consisting of a 1×1 standard convolution followed by a CBAM; specifically, the number of output channels is 1024, the CBAM channel-reduction coefficient is reduction = 16, and the convolution kernel size is k = 49. A CBAM is also added inside the SPPCSPC layer at layer 51 of the YOLOv7 network structure, likewise with reduction = 16 and kernel size k = 49.
Specifically, the attention mechanism module CBAM is composed of two parts: a channel attention module and a spatial attention module. The channel attention module comprises a fully connected layer and two ReLU activation layers, and the spatial attention module comprises a convolution layer and a sigmoid activation function.
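For reference, the standard CBAM formulation (Woo et al., 2018) can be sketched in PyTorch as below. This is an illustrative reconstruction, not the authors' code: it uses the patent's reduction = 16, and its channel branch follows the usual two-linear-layer MLP with one ReLU, which differs slightly from the wording above.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # Shared MLP applied to both the average- and max-pooled descriptors.
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))   # global average pooling
        mx = self.mlp(x.amax(dim=(2, 3)))    # global max pooling
        scale = torch.sigmoid(avg + mx).view(b, c, 1, 1)
        return x * scale

class SpatialAttention(nn.Module):
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        # 2-channel input: channel-wise average map and max map.
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)
        mx = x.amax(dim=1, keepdim=True)
        scale = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * scale

class CBAM(nn.Module):
    def __init__(self, channels: int, reduction: int = 16, kernel_size: int = 7):
        super().__init__()
        self.ca = ChannelAttention(channels, reduction)
        self.sa = SpatialAttention(kernel_size)

    def forward(self, x):
        return self.sa(self.ca(x))   # channel attention, then spatial attention
```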
Step 3, training the improved YOLOv7 target detection model
The target detection model constructed in step 2 is trained on the training set obtained in step 1 to obtain a trained target detection model. Specifically, the training method is the YOLOv7 model training procedure, with training parameters: 400 iterations, image size 640, batch size 32, and learning rate 0.01.
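The step-3 hyperparameters can be collected in one place; the dictionary keys below are assumed names for illustration, since the patent specifies the values but not the training script.

```python
# Step-3 training parameters as stated in the patent.
train_cfg = {
    "epochs": 400,     # number of training iterations
    "img_size": 640,   # input image size
    "batch_size": 32,  # number of pictures per batch
    "lr0": 0.01,       # initial learning rate
}
```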
Step 4, acquiring image frames F_t (t = 1, 2, 3, ...) from the monitoring video stream, sending each acquired frame into the target detection model trained in step 3, and performing target detection on the input frame to obtain a pedestrian target detection result, which comprises the pedestrian position information marked by a target detection box and a target detection category; the categories comprise the front and the back of a pedestrian. Once trained, the model can run detection continuously, as sketched below.
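A hedged sketch of this acquisition-and-detection loop follows. `detect_head_shoulder` is a hypothetical wrapper around the trained YOLOv7 model (not an API from the patent), returning one (u, v, w, h, score, cls) tuple per pedestrian with cls in {"front", "back"}; the stream URL is a placeholder.

```python
import cv2

def detect_head_shoulder(frame):
    """Hypothetical wrapper around the trained detector; returns a list of
    (u, v, w, h, score, cls) tuples, cls being "front" or "back"."""
    return []  # placeholder: plug in the trained YOLOv7 model here

cap = cv2.VideoCapture("rtsp://camera/stream")  # placeholder stream URL
t = 0
while cap.isOpened():
    ok, frame = cap.read()                      # image frame F_t
    if not ok:
        break
    t += 1
    detections = detect_head_shoulder(frame)
    # ... hand `detections` to the OC-SORT tracking stage of step 5 ...
cap.release()
```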
Step 5, pedestrian target tracking
Multi-target tracking is performed on the pedestrian detection results obtained in step 4 using the OC-SORT algorithm. OC-SORT uses a Kalman-filter-based multi-target tracker to predict and update the state of each target track, and matches tracks through HIoU together with track-direction consistency.
Specifically, step 5 includes the following steps:
Step 5.1, obtaining the tracker predictions
Let T = {T_1, T_2, ..., T_n} be the set of center-point tracks of the target detection boxes obtained from pedestrian detection. For each track, a Kalman filter computes its predicted box B̂_t in the current image frame from the track's observed positions in previous frames. The track set T records, for every frame of the image, the center-point coordinates of each successfully matched pedestrian together with the detection category at that position; the tracks are mutually independent, and each is assigned a unique ID i to distinguish different pedestrian targets.
Specifically, this step uses a Kalman filter with a linear constant-velocity model to approximate the displacement of each pedestrian target between frames. The relevant parameters are as follows:
The state of a pedestrian target is modeled as x = [u, v, s, r, u̇, v̇, ṡ]ᵀ, where x is the target's state vector, u and v are the horizontal and vertical coordinates of the center of the target detection box, s and r are the area and aspect ratio of the target detection box, and u̇, v̇, ṡ are the time derivatives of u, v, s respectively.
The observation vector is z = [u, v, s, r]ᵀ, converted from the initial bounding box obtained by target detection.
The state covariance matrix P, the process noise covariance Q and the measurement noise covariance R are initialized to preset constant matrices.
Following the Kalman prediction step, the state information x_{t-1} of a track at time t-1 yields the track's predicted position at time t, i.e. the predicted box B̂_t.
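The prediction step can be sketched with numpy as below, assuming the 7-dimensional state described above; the covariance initializations are placeholders, since the patent only states that P, Q and R are preset constants.

```python
import numpy as np

# Constant-velocity transition matrix for x = [u, v, s, r, du, dv, ds].
F = np.eye(7)
F[0, 4] = F[1, 5] = F[2, 6] = 1.0   # u += du, v += dv, s += ds per frame
H = np.eye(4, 7)                    # observation model: z = [u, v, s, r]

def kalman_predict(x: np.ndarray, P: np.ndarray, Q: np.ndarray):
    """One Kalman prediction step: returns predicted state and covariance."""
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    return x_pred, P_pred

def state_to_box(x: np.ndarray) -> np.ndarray:
    """Convert a state [u, v, s, r, ...] to a box (x1, y1, x2, y2)."""
    u, v, s, r = x[:4]
    w = np.sqrt(s * r)          # s = w*h, r = w/h  =>  w = sqrt(s*r)
    h = s / w
    return np.array([u - w / 2, v - h / 2, u + w / 2, v + h / 2])
```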
If the track set T is empty at the current moment, step 5.5 is executed directly.
Step 5.2, extracting each track's velocity vector V_t from the track set T and associating the target detection boxes D_t with the predicted boxes B̂_t;
The velocity vector V_t is computed from the position change over the two most recent consecutive frames of the track: the difference between the two center points is taken and normalized. Let the center points of the two most recent consecutive frame positions be p_{t-1} = (u_{t-1}, v_{t-1}) and p_t = (u_t, v_t); then V_t = (p_t - p_{t-1}) / ‖p_t - p_{t-1}‖.
The association cost is C = C_HIoU + λ·C_v. C_HIoU is computed from the IoU value and the height intersection ratio between a predicted box and a target detection box, HIoU = [S_I / (S_p + S_d - S_I)] · [h_I / (h_p + h_d - h_I)], where S_p is the area of the predicted box, S_d the area of the target detection box, S_I their intersection area, h_p the height of the predicted box, h_d the height of the target detection box, and h_I the overlap height of the predicted box and the target detection box; only the results whose HIoU exceeds a preset threshold are retained. C_v is the angular difference between the track's velocity vector and the velocity vector formed by the track's previously observed position and the newly detected position, and λ is a weight factor set to a preset constant. According to the association cost C, linear assignment is performed with the Hungarian algorithm to obtain the optimal matching between predicted boxes and target detection boxes, returning three kinds of results: T_matched (matched tracks), T_unmatched (unmatched tracks), and D_unmatched (unmatched target detection boxes);
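The association described above can be sketched as follows. The weight lam and the HIoU retention threshold are placeholders (the patent fixes them as preset constants); boxes are (x1, y1, x2, y2) arrays, and scipy's linear_sum_assignment plays the role of the Hungarian algorithm.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def hiou(a, b) -> float:
    """IoU multiplied by the height intersection ratio of two boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    iou = inter / (area_a + area_b - inter + 1e-9)
    h_i = max(0.0, y2 - y1)                     # overlap height h_I
    h_a, h_b = a[3] - a[1], b[3] - b[1]
    return iou * h_i / (h_a + h_b - h_i + 1e-9)

def angle_cost(v_track, p_prev, p_new) -> float:
    """Angular difference between the track velocity and the direction
    from the previous observation to the new detection (the C_v term)."""
    d = np.asarray(p_new, float) - np.asarray(p_prev, float)
    d /= np.linalg.norm(d) + 1e-9
    cos = float(np.clip(np.dot(v_track, d), -1.0, 1.0))
    return np.arccos(cos) / np.pi               # normalized to [0, 1]

def associate(pred_boxes, det_boxes, velocities, prev_centers,
              lam=0.2, thresh=0.1):
    cost = np.zeros((len(pred_boxes), len(det_boxes)))
    sim = np.zeros_like(cost)
    for i, pb in enumerate(pred_boxes):
        for j, db in enumerate(det_boxes):
            sim[i, j] = hiou(pb, db)
            center = ((db[0] + db[2]) / 2, (db[1] + db[3]) / 2)
            cost[i, j] = (1.0 - sim[i, j]) + lam * angle_cost(
                velocities[i], prev_centers[i], center)
    rows, cols = linear_sum_assignment(cost)     # Hungarian assignment
    return [(i, j) for i, j in zip(rows, cols) if sim[i, j] >= thresh]
```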
Step 5.3, if unmatched target detection boxes D_unmatched and unmatched tracks T_unmatched remain, performing a second round of association: for the unmatched tracks, computing the association cost between the remaining target detection boxes and each track's last observed position, retaining the results exceeding the threshold, performing linear assignment with the Hungarian algorithm to obtain the optimal matching, and updating T_unmatched and D_unmatched;
Step 5.4, track update
For each successfully matched detection result and track, the detection result is used to update the track's velocity vector and the tracker's observed position; the miss count of each unmatched track in T_unmatched is incremented by 1;
Step 5.5, creating new tracks
For each unmatched detection result in D_unmatched, a new track and tracker are created, and the detection is initialized as the first observed position of the new track.
Step 5.6, if a track's miss count reaches 30, meaning the track has not been updated within the set time, the track is deleted.
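The bookkeeping of steps 5.4 to 5.6 can be sketched as below; the class and field names are illustrative, with MAX_MISSES following the stated deletion threshold of 30.

```python
from dataclasses import dataclass, field

MAX_MISSES = 30  # step 5.6: delete a track after 30 consecutive misses

@dataclass
class Track:
    track_id: int
    observations: list = field(default_factory=list)  # (u, v, w, h, score, cls)
    misses: int = 0

    def update(self, obs):
        """Step 5.4: a matched detection refreshes the track."""
        self.observations.append(obs)
        self.misses = 0

def step_tracks(tracks, matches, unmatched_dets, next_id):
    matched_ids = {id(trk) for trk, _ in matches}
    for trk, det in matches:
        trk.update(det)
    for trk in tracks:                       # step 5.4: unmatched tracks
        if id(trk) not in matched_ids:
            trk.misses += 1
    for det in unmatched_dets:               # step 5.5: create new tracks
        trk = Track(next_id)
        trk.update(det)
        tracks.append(trk)
        next_id += 1
    tracks[:] = [t for t in tracks if t.misses < MAX_MISSES]  # step 5.6
    return next_id
```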
Step 6, retrograde behavior determination
Traverse all tracks in T and select those whose recorded observation length exceeds 3. For each selected track, extract the initial observation position box_0 and the observation position box_t at time t; compute the displacement variation of the target within the track from the extracted observations, and judge whether the track exhibits retrograde behavior by combining the target detection category and the velocity-vector result. An observation position is box = (u, v, w, h, score, cls), where u is the abscissa of the center point of the target detection box, v is the ordinate of the center point, w is the width of the target detection box, h is the height of the target detection box, score is the confidence, and cls is the detection category. In a real scene the height of the detection box does not fluctuate greatly under occlusion, limb motion and similar conditions, so it reflects the true size of the target well. The following quantities are computed from the observations extracted from a track:
Displacement variation: shift = (v_t - v_0) / ((h_t + h_0) / 2), where v_t is the ordinate of the center point of the target detection box at time t, v_0 is the ordinate of the center point at the initial time, h_t is the height of the target detection box at time t, and h_0 is the height of the target detection box at the initial time;
Count ratio of the front category in the track record: ratio_front = count(cls = front) / count(all observations), where count(·) denotes the number of observations satisfying the condition.
The method for judging whether a track exhibits retrograde behavior is: the displacement variation shift is greater than 0.25; the count ratio of the front category in the track record, ratio_front, is greater than 0.7; the target detection category cls_t at the last observation position, i.e. the observation at time t, is front; and the velocity vector V_t of the track at time t has a positive vertical component, i.e. the target is moving toward the camera. A track satisfying all of the above conditions is judged to be retrograde.
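Putting step 6 together, a decision sketch follows; it assumes the observation tuples used earlier and interprets the velocity condition as requiring downward (toward-camera) image motion, i.e. a positive vertical component.

```python
def is_retrograde(observations) -> bool:
    """observations: list of (u, v, w, h, score, cls), one per matched frame."""
    if len(observations) <= 3:           # only tracks longer than 3 are judged
        return False
    _, v0, _, h0, _, _ = observations[0]
    _, vt, _, ht, _, cls_t = observations[-1]
    shift = (vt - v0) / ((ht + h0) / 2)  # normalized displacement variation
    front_ratio = sum(o[5] == "front" for o in observations) / len(observations)
    vy = vt - observations[-2][1]        # sign of the vertical velocity
    return shift > 0.25 and front_ratio > 0.7 and cls_t == "front" and vy > 0
```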
While embodiments of the present invention have been shown and described above, it should be understood that the above embodiments are illustrative and not to be construed as limiting the invention, and that alterations, modifications, substitutions and variations may be made by those of ordinary skill in the art without departing from the scope of the invention.

Claims (6)

1. A pedestrian retrograde judgment method based on multi-target tracking is characterized by comprising the following steps:
step 1, constructing a target detection data set
collecting open-source person datasets from the web together with pedestrian image data from the monitored scene, using LabelImg to mark rectangular boxes around head-and-shoulder targets, and dividing the images whose annotations meet the standard into a training set and a validation set;
step 2, constructing an improved YOLOv7 target detection model
the improved YOLOv7 target detection model is based on the YOLOv7 network structure; a convolution layer with an attention mechanism module CBAM is added between layer 50 and layer 51, consisting of a 1×1 standard convolution followed by a CBAM, and a CBAM is added inside the SPPCSPC layer at layer 51 of the YOLOv7 network structure;
step 3, training the improved YOLOv7 target detection model
training the target detection model constructed in step 2 on the training set obtained in step 1 to obtain a trained target detection model;
step 4, acquiring image frames F_t (t = 1, 2, 3, ...) from a monitoring video stream, sending each acquired frame into the target detection model trained in step 3, and performing target detection on the input frame to obtain a pedestrian target detection result, which comprises the pedestrian's position information marked by a target detection box and a target detection category, the categories comprising the front and the back of a pedestrian;
Step 5, pedestrian target tracking
Carrying out multi-target tracking on the pedestrian target detection result obtained in the step 4 by adopting an OC-SORT algorithm;
the step 5 specifically comprises the following steps:
step 5.1, obtaining the tracker predictions
let T = {T_1, T_2, ..., T_n} be the set of center-point tracks of the target detection boxes obtained from pedestrian detection; for each track, a Kalman filter computes its predicted box B̂_t in the current image frame from the track's observed positions in previous frames;
if the track set T is empty at the current moment, step 5.5 is executed;
step 5.2, extracting each track's velocity vector V_t from the track set T and associating the target detection boxes D_t with the predicted boxes B̂_t;
letting the center points of the two most recent consecutive frame positions be p_{t-1} = (u_{t-1}, v_{t-1}) and p_t = (u_t, v_t), then V_t = (p_t - p_{t-1}) / ‖p_t - p_{t-1}‖;
the association cost is C = C_HIoU + λ·C_v, where C_HIoU is computed from the IoU value and the height intersection ratio between a predicted box and a target detection box, HIoU = [S_I / (S_p + S_d - S_I)] · [h_I / (h_p + h_d - h_I)], with S_p the area of the predicted box, S_d the area of the target detection box, S_I their intersection area, h_p the height of the predicted box, h_d the height of the target detection box, and h_I the overlap height of the predicted box and the target detection box; only the results whose HIoU exceeds a preset threshold are retained; C_v is the angular difference between the track's velocity vector and the velocity vector formed by the track's previously observed position and the newly detected position, and λ is a weight factor; according to the association cost C, linear assignment is performed with the Hungarian algorithm to obtain the optimal matching between predicted boxes and target detection boxes, returning three kinds of results: T_matched (matched tracks), T_unmatched (unmatched tracks), and D_unmatched (unmatched target detection boxes);
step 5.3, if unmatched target detection boxes D_unmatched and unmatched tracks T_unmatched remain, performing a second round of association: for the unmatched tracks, computing the association cost between the remaining target detection boxes and each track's last observed position, retaining the results exceeding the threshold, performing linear assignment with the Hungarian algorithm to obtain the optimal matching, and updating T_unmatched and D_unmatched;
step 5.4, track update
for each successfully matched detection result and track, using the detection result to update the track's velocity vector and the tracker's observed position, and incrementing the miss count of each unmatched track in T_unmatched by 1;
step 5.5, creating new tracks
for each unmatched detection result in D_unmatched, creating a new track and tracker and initializing the detection as the first observed position of the new track;
step 6, retrograde behavior determination
traversing all tracks in T, selecting the tracks whose recorded observation length exceeds 3, extracting each selected track's initial observation position box_0 and its observation position box_t at time t, computing the displacement variation of the target within the track from the extracted observations, and judging whether the track exhibits retrograde behavior from the target detection category and the velocity-vector result; the method in step 6 for judging whether a track exhibits retrograde behavior is: the displacement variation is greater than 0.25, the count ratio of the front category in the track record is greater than 0.7, the target detection category at the last observation position is front, and the vertical component of the track's velocity vector V_t is positive; a track satisfying all of the above conditions is judged to be retrograde.
2. The pedestrian retrograde judgment method based on multi-target tracking according to claim 1, wherein: in step 2, the convolution layer added between layer 50 and layer 51 has 1024 output channels, and the CBAM added between layer 50 and layer 51 has a channel-reduction coefficient reduction = 16 and a convolution kernel size k = 49.
3. The pedestrian retrograde judgment method based on multi-target tracking according to claim 1, wherein: in step 2, the attention mechanism module CBAM is composed of a channel attention module and a spatial attention module, where the channel attention module comprises a fully connected layer and two ReLU activation layers, and the spatial attention module comprises a convolution layer and a sigmoid activation function.
4. The pedestrian retrograde judgment method based on multi-target tracking according to claim 1, wherein: in step 3, the training method is the YOLOv7 model training procedure, with training parameters: 400 iterations, image size 640, batch size 32, and learning rate 0.01.
5. The pedestrian retrograde judgment method based on multi-target tracking according to claim 1, wherein: in step 5, the OC-SORT algorithm uses a Kalman-filter-based multi-target tracker to predict and update the state of each target track, and performs matching through HIoU together with track-direction consistency.
6. The pedestrian retrograde judgment method based on multi-target tracking according to claim 1, wherein: a step 5.6 is also included: if a track's miss count reaches 30, the track is deleted.
CN202410354175.6A 2024-03-27 2024-03-27 Pedestrian retrograde judgment method based on multi-target tracking Active CN117953546B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410354175.6A 2024-03-27 2024-03-27 Pedestrian retrograde judgment method based on multi-target tracking CN117953546B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410354175.6A 2024-03-27 2024-03-27 Pedestrian retrograde judgment method based on multi-target tracking CN117953546B (en)

Publications (2)

Publication Number Publication Date
CN117953546A CN117953546A (en) 2024-04-30
CN117953546B (en) 2024-06-11

Family

ID=90792507

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410354175.6A Active CN117953546B (en) 2024-03-27 2024-03-27 Pedestrian retrograde judgment method based on multi-target tracking

Country Status (1)

Country Link
CN (1) CN117953546B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111860282A (en) * 2020-07-15 2020-10-30 中国电子科技集团公司第三十八研究所 Subway section passenger flow volume statistics and pedestrian retrograde motion detection method and system
CN114529799A (en) * 2022-01-06 2022-05-24 浙江工业大学 Aircraft multi-target tracking method based on improved YOLOV5 algorithm
WO2023124133A1 (en) * 2021-12-29 2023-07-06 上海商汤智能科技有限公司 Traffic behavior detection method and apparatus, electronic device, storage medium, and computer program product


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Dark-SORT: Multi-Person Tracking in Underground Coal Mines Using Adaptive Discrete Weighting; Rui Wang et al.; IEEE Access, vol. 11; 2023-12-07 *
Experiment study on pedestrian abnormal behavior detection and crowd stability analysis in cross passages; Cuiling Li et al.; International Workshop on Automation, Control, and Communication Engineering (IWACCE 2022), vol. 12492; 2022-12-09 *
Research on an intelligent road-network operation-state monitoring system based on traffic video; Li Xueqian; China Master's Theses Full-text Database, Engineering Science & Technology II, No. 3; 2021-03-15 *

Also Published As

Publication number Publication date
CN117953546A (en) 2024-04-30


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant