CN114550094A - Method and system for flow statistics and manned judgment of tricycle - Google Patents

Info

Publication number
CN114550094A
Authority
CN
China
Prior art keywords
tricycle
rectangular frame
coordinates
human body
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210177738.XA
Other languages
Chinese (zh)
Inventor
房思思
甘彤
商国军
任好
杨利红
程剑
张琦珺
刘海涛
卢安安
唐亮
凌虎
马彪彪
刘正丽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CETC 38 Research Institute
Original Assignee
CETC 38 Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CETC 38 Research Institute
Priority to CN202210177738.XA
Publication of CN114550094A

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/277Analysis of motion involving stochastic approaches, e.g. using Kalman filters
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • GPHYSICS
    • G08SIGNALLING
    • G08GTRAFFIC CONTROL SYSTEMS
    • G08G1/00Traffic control systems for road vehicles
    • G08G1/065Traffic control systems for road vehicles by counting the vehicles in a section of the road or in a parking area, i.e. comparing incoming count with outgoing count
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30232Surveillance
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30242Counting objects in image

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method and a system for tricycle flow statistics and manned judgment. The method comprises the following steps: acquiring image data from a monitored traffic scene; constructing a target tracking model of the tricycle and inputting the image data into it to obtain the rectangular frame coordinates of each tricycle in the image and the ID number of the tricycle; obtaining the image area containing the tricycle from the rectangular frame coordinates of the tricycle in the image; constructing a model for detecting human bodies on the tricycle and inputting that image area into it to obtain the rectangular frame coordinates of each human body; and judging whether the tricycle carries people from the positional relation between the human body rectangular frame coordinates and the tricycle rectangular frame coordinates. A rectangular frame area A perpendicular to the direction of traffic flow is set to count tricycle flow. By locating the tricycle region in the video frame with a combination of deep learning and a tracking algorithm, the method addresses the low accuracy of judging illegal people-carrying by tricycles in road traffic scenes and counts tricycle flow within a given region.

Description

Method and system for flow statistics and manned judgment of tricycle
Technical Field
The invention relates to the technical field of target detection and tracking in intelligent transportation, in particular to a method and a system for tricycle flow statistics and manned judgment.
Background
With the development of automation and artificial intelligence, concepts such as intelligent transportation and smart cities have been proposed. At present, high-definition cameras are widely installed on expressways, urban roads and the major arterial roads of villages and towns in China, so that passing motor vehicles, non-motor vehicles, pedestrians and the like can be captured and recorded. Video recordings show that a large number of tricycles are still on the road every day, and that many carry people even where relevant laws prohibit it. Illegal people-carrying by tricycles causes many traffic accidents in China every year. Processing the video images from the high-definition cameras already installed on roads therefore provides an important reference for preventing illegal people-carrying by tricycles, and also allows tricycle flow to be monitored in real time.
Investigation shows that illegal people-carrying by tricycles is at present judged mainly by people, which is inaccurate, time-consuming and labor-intensive.
Application No. 202010697598.X discloses a method for identifying people carried on a tricycle with a convolutional neural network. It detects the tricycle area in an image and then detects the number of human bodies within that area. However, that application does not number and track different tricycle targets, so tricycle flow statistics and illegal people-carrying judgment cannot be realized at the same time. In addition, the tricycle occupies only a small part of the image captured by the original camera, so performing human body detection directly on that area can degrade the detection result.
Disclosure of Invention
The invention aims to solve the technical problems of how to count the flow of the tricycle in real time and intelligently judge the passenger carrying condition of the tricycle.
The invention solves the technical problems through the following technical means:
a method for flow statistics and manned judgment of a tricycle comprises the following steps:
S1: acquiring image data from a monitored traffic scene;
S2: constructing a target tracking model of the tricycle, inputting the image data into the target tracking model, and obtaining the coordinates of the rectangular frame of the tricycle in the image and the ID number of the tricycle;
S3: acquiring an image area containing the tricycle according to the coordinates of the rectangular frame of the tricycle in the image;
S4: constructing a human body detection model on the tricycle, and inputting the image area containing the tricycle into the human body detection model to obtain the coordinates of the human body rectangular frame;
S5: judging whether the tricycle carries people from the positional relation between the coordinates of the human body rectangular frame and the coordinates of the corresponding tricycle rectangular frame, specifically: mapping the coordinates of the human body rectangular frame to the original input image; converting the coordinates of the corresponding tricycle rectangular frame into upper-left and lower-right corner coordinates; when the horizontal and vertical coordinates of the lower right corner of the human body rectangular frame lie respectively between the horizontal and vertical coordinates of the upper left and lower right corners of the corresponding tricycle rectangular frame, adding one to the number of human bodies on that tricycle; if the number of human bodies on the tricycle is greater than or equal to 2, the tricycle is considered to be carrying people; otherwise, it is considered not to be carrying people;
S6: setting a rectangular frame area A perpendicular to the direction of tricycle traffic flow, and counting tricycle flow by judging whether a tricycle passes a given side of area A that is perpendicular to the flow direction.
The method comprises the steps of establishing a target tracking model and a human body detection model of the tricycle, respectively detecting the tricycle and a human body, and numbering the tricycle. The number of people on the tricycle is judged by judging the relation between the coordinates of the human body rectangular frame and the coordinates of the tricycle rectangular frame, so that whether the tricycle carries people or not is judged, and the judgment precision is high; and counting the flow of the tricycle by judging whether the tricycle passes through a rectangular area vertical to the direction of the flow.
Further, constructing the target tracking model of the tricycle in step S2 specifically includes: (1) making a tricycle detection data set: extracting frames from video data acquired from traffic scenes, sampling them, deleting image frames that do not contain a tricycle, marking the position of each tricycle in the data set with a rectangular frame, and assigning the tricycle category; (2) connecting a set of convolution layers (3 × 3 convolution, batch normalization layer and activation function layer) at each output level of a YOLO series target detection network (YOLOv3, YOLOv4, YOLOv5, etc.), so that the final output of the network becomes target frame coordinates, target confidence and target feature vectors; (3) training a tricycle detection model with the improved YOLO series network on the tricycle detection data set; (4) obtaining the target tracking model of the tricycle by combining the tricycle detection model with a Kalman filtering algorithm and a matching algorithm.
Further, the coordinates of the rectangular frame of the tricycle in the image and the ID number of the tricycle in step S2 are specifically as follows: the coordinates of the rectangular frame are the horizontal and vertical coordinates of its upper left corner together with its width and height; the confidence of the rectangular frame is compared with a set tricycle detection threshold, and when the confidence is greater than the threshold the frame is considered a valid tricycle rectangular frame; the tracking model distinguishes different tricycles and assigns a different ID number to each valid tricycle rectangular frame.
Further, acquiring the image area containing the tricycle in step S3 specifically includes: (1) subtracting a settable external expansion pixel value from both the horizontal and the vertical coordinate of the upper left corner of the tricycle rectangular frame obtained by the target tracking model, giving the upper-left corner coordinates of the tricycle image area; (2) adding the external expansion pixel value and, respectively, the width and height of the tricycle rectangular frame to the horizontal and vertical coordinates of the upper left corner of the tricycle rectangular frame, giving the lower-right corner coordinates of the tricycle image area; (3) cropping the tricycle image area from the original input image using these upper-left and lower-right corner coordinates, and storing it.
Further, constructing the human body detection model on the tricycle in step S4 specifically includes: (1) making a human body detection data set from the cropped and stored tricycle image areas: marking the position of each human body on the tricycle with a rectangular frame and assigning the human body category; (2) training a human body detection model with a deep convolutional neural network on the human body detection data set.
Further, the coordinates of the human body rectangular frame in step S4 are specifically as follows: the coordinates of the rectangular frame are the horizontal and vertical coordinates of its upper left corner and of its lower right corner; the confidence of the rectangular frame is compared with a set human body detection threshold, and when the confidence is greater than the threshold the frame is considered a valid human body rectangular frame.
Further, in step S6, counting tricycle flow by judging whether a tricycle passes a given side of area A perpendicular to the traffic flow direction specifically includes: (1) of the two sides of area A perpendicular to the traffic flow direction, the one nearer to the tricycle is denoted L1, and the two sides of area A parallel to the traffic flow direction are denoted L2 and L3; (2) judging whether the ordinates of the upper left corner and the lower right corner of a tricycle's rectangular frame lie on opposite sides of L1, and whether the abscissa of the upper left corner or the abscissa of the lower right corner of that rectangular frame lies between L2 and L3; if both conditions hold, storing the ID number and rectangular frame coordinate information of the tricycle and carrying out tricycle flow statistics.
Corresponding to the method, the invention also provides a tricycle flow counting and manned judging system, which comprises:
the data collection module is used for acquiring image data for monitoring the traffic scene;
the tricycle rectangular frame and number acquisition module inputs the image data into a target tracking model to acquire the coordinates of the tricycle rectangular frame in the image and the number of the tricycle;
the tricycle image area acquisition module is used for acquiring an image area containing the tricycle according to the coordinates of a rectangular frame of the tricycle in an image;
the human body rectangular frame acquisition module on the tricycle constructs a human body detection model on the tricycle, and the image area containing the tricycle is input into the human body detection model to obtain the coordinates of the human body rectangular frame;
and the module for judging whether the tricycle carries people judges whether the tricycle carries people or not by judging the position relation between the coordinates of the human body rectangular frame and the coordinates of the corresponding tricycle rectangular frame.
The tricycle flow counting module is provided with a rectangular frame area A perpendicular to the tricycle flow running direction, and counts the tricycle flow by judging whether the tricycle passes through a certain side of the area A perpendicular to the tricycle flow direction.
Further, the tricycle rectangular frame and number acquisition module specifically includes: (1) making a tricycle detection data set: extracting frames from video data acquired from traffic scenes, sampling them, deleting image frames that do not contain a tricycle, marking the position of each tricycle in the data set with a rectangular frame, and assigning the tricycle category; (2) connecting a set of convolution layers (3 × 3 convolution, batch normalization layer and activation function layer) at each output level of a YOLO series target detection network (YOLOv3, YOLOv4, YOLOv5, etc.), so that the final output of the network becomes target frame coordinates, target confidence and target feature vectors; (3) training a tricycle detection model with the improved YOLO series network on the tricycle detection data set; (4) obtaining the target tracking model of the tricycle by combining the tricycle detection model with a Kalman filtering algorithm and a matching algorithm. The coordinates of the rectangular frame are the horizontal and vertical coordinates of its upper left corner together with its width and height; the confidence of the rectangular frame is compared with a set tricycle detection threshold, and when the confidence is greater than the threshold the frame is considered a valid tricycle rectangular frame; the tracking model distinguishes different tricycles and assigns a different ID number to each valid tricycle rectangular frame.
Further, the tricycle image area acquisition module specifically includes: (1) subtracting a settable external expansion pixel value from both the horizontal and the vertical coordinate of the upper left corner of the tricycle rectangular frame obtained by the target tracking model, giving the upper-left corner coordinates of the tricycle image area; (2) adding the external expansion pixel value and, respectively, the width and height of the tricycle rectangular frame to the horizontal and vertical coordinates of the upper left corner of the tricycle rectangular frame, giving the lower-right corner coordinates of the tricycle image area; (3) cropping the tricycle image area from the original input image using these upper-left and lower-right corner coordinates, and storing it.
Further, the module for judging whether the tricycle carries people specifically includes: mapping the coordinates of the human body rectangular frame to the original input image; converting the coordinates of the corresponding tricycle rectangular frame into upper-left and lower-right corner coordinates; when the horizontal and vertical coordinates of the lower right corner of the human body rectangular frame lie respectively between the horizontal and vertical coordinates of the upper left and lower right corners of the corresponding tricycle rectangular frame, adding one to the number of human bodies on that tricycle; if the number of human bodies on the tricycle is greater than or equal to 2, the tricycle is considered to be carrying people; otherwise, it is considered not to be carrying people.
Further, the tricycle flow statistics module specifically includes: (1) of the two sides of area A perpendicular to the traffic flow direction, the one nearer to the tricycle is denoted L1, and the two sides of area A parallel to the traffic flow direction are denoted L2 and L3; (2) judging whether the ordinates of the upper left corner and the lower right corner of a tricycle's rectangular frame lie on opposite sides of L1, and whether the abscissa of the upper left corner or the abscissa of the lower right corner of that rectangular frame lies between L2 and L3; if both conditions hold, storing the ID number and rectangular frame coordinate information of the tricycle and carrying out tricycle flow statistics.
The invention has the advantages that:
the method comprises the steps of applying a multi-target tracking algorithm and a multi-target detection algorithm based on a deep convolutional neural network to a road traffic scene, and detecting a tricycle and a human body respectively by establishing a target tracking model and a human body detection model of the tricycle. The number of people on the tricycle is judged by judging the relation between the coordinates of the human body rectangular frame and the coordinates of the tricycle rectangular frame, so that whether the tricycle carries people or not is judged, and the judgment precision is high; meanwhile, the flow of the tricycle is counted by judging whether the tricycle passes through a rectangular area perpendicular to the traffic flow direction. The problem of low accuracy of illegal man-carrying judgment of the tricycle in a road traffic scene is solved, and the flow counting function of the tricycle is realized.
Drawings
FIG. 1 is a flow chart of a method for flow statistics and occupant determination for a tricycle in accordance with an embodiment of the present invention.
Fig. 2 is a schematic diagram of a tricycle tracking algorithm according to an embodiment of the present invention.
FIG. 3 is a schematic diagram of a tricycle detection algorithm according to an embodiment of the invention.
FIG. 4 is a schematic diagram of a human detection algorithm in one embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the embodiments of the present invention, and it is obvious that the described embodiments are some embodiments of the present invention, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
As shown in Fig. 1, a method for flow statistics and manned judgment of a tricycle according to an embodiment of the present invention includes the following steps:
S1: Acquiring image data from a monitored traffic scene.
In this embodiment, video data of traffic scenes at 7 intersections are acquired. By extracting frames from the video, sampling, renaming, and deleting image frames that do not contain a tricycle, 20000 images containing tricycles are collected. The tricycle targets are labeled with rectangular frames using a labeling tool, and each labeled tricycle rectangular frame is converted into the format (class_id, x, y, w, h), where class_id is the category number of the tricycle, x is the horizontal coordinate of the normalized center point, y is the vertical coordinate of the normalized center point, w is the width of the normalized rectangular frame, and h is the height of the normalized rectangular frame.
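The label conversion described above can be sketched as follows. This is a minimal illustration, assuming the labeling tool exports pixel-space corner coordinates and that normalization divides by the image width and height; the function name `to_normalized_label` and the sample values are hypothetical and do not come from the patent.

```python
def to_normalized_label(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel-space corner box to (class_id, x, y, w, h).

    x, y are the normalized center point; w, h are the normalized size.
    Dividing by the image size is an assumption about how the
    normalization is done.
    """
    x_center = (x_min + x_max) / 2.0 / img_w
    y_center = (y_min + y_max) / 2.0 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return (class_id, x_center, y_center, width, height)


# Hypothetical annotation: class 0 (tricycle) in a 1000 x 800 image.
label = to_normalized_label(0, 100, 200, 300, 400, img_w=1000, img_h=800)
print(label)
```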
S2: Constructing a target tracking model of the tricycle, inputting the image data into the target tracking model, and obtaining the coordinates of the rectangular frame of the tricycle in the image and the ID number of the tricycle.
In this embodiment, the target tracking model of the tricycle is built from a target tracking algorithm consisting of a one-stage target detection network, feature matching, detection-frame IOU (intersection over union) matching and Kalman filtering. A set of convolution layers (3 × 3 convolution, batch normalization layer, activation function layer) is connected at each output level of a YOLO series target detection network (YOLOv3, YOLOv4, YOLOv5, etc.), changing the original output of the network into target frame coordinates, target confidence, target class and target feature vector. Specifically, a YOLO series network consists mainly of a backbone network and several YOLO layers. If the dimension of the extracted tricycle feature is D, the output dimension of the network is (6A + D) × H × W, where A is the number of anchors corresponding to each YOLO layer, set here to 4; H and W are respectively the height and width of the output of the corresponding YOLO layer; and 6A indicates that each anchor corresponds to 6 parameters: the category id, the confidence, the horizontal and vertical coordinates (x, y) of the upper left corner of the tricycle rectangular frame, and the width w and height h of the rectangular frame.
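As a quick check of the output dimension (6A + D) × H × W, the arithmetic can be written out as below. The feature dimension D = 128 and the 19 × 19 grid are illustrative assumptions only; the patent does not state them. The function name `head_output_shape` is likewise hypothetical.

```python
def head_output_shape(num_anchors, feature_dim, grid_h, grid_w, params_per_anchor=6):
    """Return (channels, H, W) for one YOLO layer with an appended feature branch.

    channels = 6A + D: six box parameters per anchor plus a D-dimensional
    feature vector per grid cell.
    """
    channels = params_per_anchor * num_anchors + feature_dim
    return (channels, grid_h, grid_w)


# A = 4 as in the embodiment; D = 128 and a 19 x 19 grid are assumed values.
shape = head_output_shape(num_anchors=4, feature_dim=128, grid_h=19, grid_w=19)
print(shape)  # (152, 19, 19)
```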
The track_id (the number that distinguishes different tricycles) and the Kalman parameters are initialized from the tricycle rectangular frames that the YOLO series network outputs for the previous frame. Feature distances are computed between the tricycle rectangular frames output by the network for the current frame and the tricycle rectangular frames that carry a track_id from the previous frame, and the two sets are matched on these distances; each matched track is updated with the rectangular frame information of its current-frame detection. IOU distances are then computed between the previous-frame tracks and the current-frame rectangular frames that remain unmatched, a second matching is performed, and matched tracks are updated in the same way. A tricycle rectangular frame with a track_id that is still unmatched is marked as lost, and it is deleted if it stays unmatched in the lost state for more than 30 frames.
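The second-stage IOU matching and the lost-track rule can be sketched roughly as follows. This is not the patent's matching algorithm, which is unspecified: the greedy best-IOU assignment, the `min_iou` threshold of 0.3, and the dictionary layout of a track are all assumptions made for illustration. Feature-distance matching, Kalman prediction, and creating new tracks from unmatched detections are left out.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x, y, w, h) boxes, (x, y) = upper left corner."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def update_tracks(tracks, detections, min_iou=0.3, max_lost=30):
    """Greedily match tracks to detections by IOU (assumed strategy).

    tracks: {track_id: {"box": (x, y, w, h), "lost": int}}, modified in place.
    detections: list of (x, y, w, h) boxes from the current frame.
    Returns the indices of detections that matched no track.
    """
    unmatched = set(range(len(detections)))
    for track_id in list(tracks):
        track = tracks[track_id]
        best, best_iou = None, min_iou
        for i in unmatched:
            score = iou(track["box"], detections[i])
            if score >= best_iou:
                best, best_iou = i, score
        if best is not None:
            # Matched: update the track with the current-frame box.
            track["box"], track["lost"] = detections[best], 0
            unmatched.discard(best)
        else:
            # Unmatched: mark as lost, delete after more than max_lost frames.
            track["lost"] += 1
            if track["lost"] > max_lost:
                del tracks[track_id]
    return sorted(unmatched)


tracks = {
    1: {"box": (10, 10, 20, 20), "lost": 0},
    2: {"box": (200, 200, 20, 20), "lost": 30},
}
detections = [(12, 11, 20, 20), (400, 400, 20, 20)]
left_over = update_tracks(tracks, detections)
print(tracks, left_over)
```

In this example track 1 is updated from the first detection, track 2 exceeds 30 lost frames and is removed, and the second detection is returned as unmatched so the caller can decide whether to start a new track.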
The tricycle images collected in step S1 are divided into a training set and a validation set at a ratio of 9:1 for model training and validation. The confidence of each tracked tricycle rectangular frame is compared with a set tricycle detection threshold, and the frame is considered valid when its confidence is greater than the threshold.
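A 9:1 split and the confidence filter might look like the sketch below. The random shuffle, the fixed seed, the tuple layout of a detection, and the threshold value 0.5 are assumptions for illustration; the patent only states the ratio and that a threshold is set.

```python
import random


def split_dataset(items, train_ratio=0.9, seed=0):
    """Shuffle items reproducibly and split them into (train, validation)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


def keep_valid(boxes, threshold):
    """Keep (track_id, confidence, box) entries whose confidence exceeds the threshold."""
    return [entry for entry in boxes if entry[1] > threshold]


# 20000 image indices stand in for the collected tricycle images.
train, val = split_dataset(range(20000))

# Hypothetical tracker output; 0.5 is an assumed threshold.
tracked = [(1, 0.92, (10, 20, 80, 60)), (2, 0.31, (200, 40, 70, 55))]
valid = keep_valid(tracked, threshold=0.5)
print(len(train), len(val), valid)
```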
The above operations yield, for each tricycle, the horizontal and vertical coordinates of the upper left corner of its rectangular frame, the width and height of the frame, and the number of the tricycle.
S3: Acquiring an image area containing the tricycle according to the coordinates of the rectangular frame of the tricycle in the image.
In this embodiment, let the coordinates of the rectangular frame of the tricycle in the image be (x, y, w, h), where x and y are respectively the abscissa and ordinate of the upper left corner, and w and h are respectively the width and height of the rectangular frame; let the external expansion pixel value be p; and let the width and height of the input image be W and H, respectively. The upper-left corner coordinates $(x_{T\_tl}, y_{T\_tl})$ and the lower-right corner coordinates $(x_{T\_br}, y_{T\_br})$ of the expanded tricycle rectangular frame are calculated as:
$x_{T\_tl} = x - p$; if $x_{T\_tl} < 0$, then $x_{T\_tl} = 0$,
$y_{T\_tl} = y - p$; if $y_{T\_tl} < 0$, then $y_{T\_tl} = 0$,
$x_{T\_br} = x + w + p$; if $x_{T\_br} \ge W$, then $x_{T\_br} = W - 1$,
$y_{T\_br} = y + h + p$; if $y_{T\_br} \ge H$, then $y_{T\_br} = H - 1$.
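The expansion and clamping rules above translate directly into code. The sketch below follows the four formulas as stated; the function name `expand_box` and the sample numbers are hypothetical.

```python
def expand_box(x, y, w, h, p, img_w, img_h):
    """Expand an (x, y, w, h) box by p pixels and clamp it to the image.

    Returns (x_tl, y_tl, x_br, y_br), the upper left and lower right corners.
    """
    x_tl = x - p
    if x_tl < 0:
        x_tl = 0
    y_tl = y - p
    if y_tl < 0:
        y_tl = 0
    x_br = x + w + p
    if x_br >= img_w:
        x_br = img_w - 1
    y_br = y + h + p
    if y_br >= img_h:
        y_br = img_h - 1
    return (x_tl, y_tl, x_br, y_br)


# A box near the left and right edges of a 110 x 720 image, expanded by 10 pixels.
region = expand_box(5, 50, 100, 80, p=10, img_w=110, img_h=720)
print(region)  # (0, 40, 109, 140)
```

The clamped corners can then be used to crop the tricycle region from the input image. Whether the lower right corner is treated as inclusive when cropping is not stated in the source.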
The processed coordinates are located in the input image to crop and store the image area of the tricycle. The stored tricycle images are screened, and 4000 tricycle images are collected, including tricycles with and without passengers.
S4: Constructing a human body detection model on the tricycle, and inputting the image area containing the tricycle into the human body detection model to obtain the coordinates of the human body rectangular frame.
In this embodiment, a labeling tool is used to label the human bodies in the tricycle images collected in S3 with rectangular frames, and each labeled human body rectangular frame is converted into the format (class_id, x, y, w, h), where class_id is the category number of the human body, x is the abscissa of the normalized center point, y is the ordinate of the normalized center point, w is the width of the normalized rectangular frame, and h is the height of the normalized rectangular frame. A YOLOv3 network is used as the human body detection network on the tricycle, and the detection results at three different scales are fused to determine the final output. The output dimension of the network is 6A × H × W, where A is the number of anchors per YOLO layer, set here to A = 3; H and W are respectively the height and width of the corresponding output; and 6A indicates that each anchor corresponds to 6 parameters: the category id, the confidence, the horizontal and vertical coordinates (x, y) of the upper left corner, and the horizontal and vertical coordinates (x, y) of the lower right corner of the human body rectangular frame.
The tricycle images collected in step S3 are divided into a training set and a validation set at a ratio of 9:1 for model training and validation. The confidence of each human body rectangular frame is compared with a set human body detection threshold, and the frame is considered valid when its confidence is greater than the threshold.
The above operations yield the horizontal and vertical coordinates of the upper left and lower right corners of the rectangular frame of each human body on the tricycle.
S5: Judging whether the tricycle carries people from the positional relation between the coordinates of the human body rectangular frame and the coordinates of the corresponding tricycle rectangular frame, specifically: mapping the coordinates of the human body rectangular frame to the original input image; converting the coordinates of the corresponding tricycle rectangular frame into upper-left and lower-right corner coordinates; when the horizontal and vertical coordinates of the lower right corner of the human body rectangular frame lie respectively between the horizontal and vertical coordinates of the upper left and lower right corners of the corresponding tricycle rectangular frame, adding one to the number of human bodies on that tricycle; if the number of human bodies on the tricycle is greater than or equal to 2, the tricycle is considered to be carrying people; otherwise, it is considered not to be carrying people.
In this embodiment, the human body rectangular frame output by the human body detection model is mapped into the original input image through coordinate transformation; let the mapped coordinates be $(x_{P\_tl}, y_{P\_tl}, x_{P\_br}, y_{P\_br})$. The coordinates of the tricycle rectangular frame obtained in S2 are converted into upper-left and lower-right corner coordinates; let the converted coordinates be $(x_{T\_tl}, y_{T\_tl}, x_{T\_br}, y_{T\_br})$. Let the number of human bodies on a tricycle be $count_{ID}$, initially 0. If the tricycle with the corresponding ID and a human body rectangular frame on it satisfy $x_{T\_tl} < x_{P\_br} \le x_{T\_br}$ and $y_{T\_tl} < y_{P\_br} \le y_{T\_br}$, then $count_{ID} = count_{ID} + 1$. If $count_{ID} \ge 2$, the tricycle with that ID number is considered to be carrying people; otherwise it is considered not to be carrying people.
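The judgment rule of S5 can be sketched as below. The function names and the tuple layout (x_tl, y_tl, x_br, y_br) are assumptions made for illustration. The threshold of 2 is taken from the text. Reading it as "driver plus at least one passenger" is an interpretation that the source does not state.

```python
from typing import Iterable, Tuple

Box = Tuple[float, float, float, float]  # (x_tl, y_tl, x_br, y_br) in original-image pixels


def count_people_on_tricycle(tricycle_box: Box, person_boxes: Iterable[Box]) -> int:
    """Count human frames whose lower-right corner falls inside the tricycle frame."""
    x_t_tl, y_t_tl, x_t_br, y_t_br = tricycle_box
    count = 0
    for _, _, x_p_br, y_p_br in person_boxes:
        # Same inequalities as in S5: strict on the upper-left side, inclusive on the lower-right side.
        if x_t_tl < x_p_br <= x_t_br and y_t_tl < y_p_br <= y_t_br:
            count += 1
    return count


def is_manned(tricycle_box: Box, person_boxes: Iterable[Box], min_people: int = 2) -> bool:
    """A tricycle is considered to be carrying people when the count reaches min_people."""
    return count_people_on_tricycle(tricycle_box, person_boxes) >= min_people


if __name__ == "__main__":
    tricycle = (100, 100, 300, 300)
    people = [(120, 110, 180, 250), (200, 120, 260, 300), (250, 50, 320, 200)]
    print(count_people_on_tricycle(tricycle, people), is_manned(tricycle, people))
```

Only the lower-right corner of each human frame is tested, as in the text, so a frame that starts outside the tricycle frame but ends inside it still counts.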
S6: A rectangular frame area A perpendicular to the running direction of the tricycle traffic flow is set, and the tricycle flow is counted by judging whether a tricycle passes through a certain side of area A that is perpendicular to the traffic flow direction.
In this embodiment, the rectangular area A is given by its upper-left and lower-right corner coordinates $(x_{A\_tl}, y_{A\_tl}, x_{A\_br}, y_{A\_br})$, and the upper-left and lower-right corner coordinates of the tricycle rectangular frame are $(x_{T\_tl}, y_{T\_tl}, x_{T\_br}, y_{T\_br})$. Let the tricycle flow be $f_{tricycle}$ and the total number of tricycles passing through area A within a short time $t$ be $N_{tricycle}$, initially 0. If a tricycle satisfies the following relationship with side L1 of area A: $y_{T\_tl} \le y_{A\_tl} < y_{T\_br}$, and the following relationship with sides L2 and L3 of area A: $x_{A\_tl} \le x_{T\_tl} < x_{A\_br}$ or $x_{A\_tl} < x_{T\_br} \le x_{A\_br}$, then $N_{tricycle} = N_{tricycle} + 1$. The tricycle flow through area A is $f_{tricycle} = N_{tricycle} / t$.
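A minimal sketch of the counting rule in S6 follows. The class and function names are hypothetical. Counting each track ID at most once is an assumption: the text only states the increment rule, though the claims mention storing the ID number of a counted tricycle.

```python
from typing import Dict, Set, Tuple

Box = Tuple[float, float, float, float]  # (x_tl, y_tl, x_br, y_br)


def crosses_l1(tricycle_box: Box, area: Box) -> bool:
    """True when the tricycle frame straddles side L1 and overlaps the span between L2 and L3."""
    x_t_tl, y_t_tl, x_t_br, y_t_br = tricycle_box
    x_a_tl, y_a_tl, x_a_br, _ = area
    straddles_l1 = y_t_tl <= y_a_tl < y_t_br
    between_l2_l3 = (x_a_tl <= x_t_tl < x_a_br) or (x_a_tl < x_t_br <= x_a_br)
    return straddles_l1 and between_l2_l3


class FlowCounter:
    """Accumulates N_tricycle over frames and reports f_tricycle = N_tricycle / t."""

    def __init__(self, area: Box) -> None:
        self.area = area
        self.counted_ids: Set[int] = set()  # assumption: one count per track ID

    def update(self, tracks: Dict[int, Box]) -> None:
        for track_id, box in tracks.items():
            if track_id not in self.counted_ids and crosses_l1(box, self.area):
                self.counted_ids.add(track_id)

    def flow(self, t: float) -> float:
        return len(self.counted_ids) / t


if __name__ == "__main__":
    counter = FlowCounter(area=(100, 200, 500, 400))
    counter.update({1: (150, 180, 250, 260), 2: (150, 250, 250, 350)})
    counter.update({1: (150, 190, 250, 270)})  # same tricycle seen again
    print(len(counter.counted_ids), counter.flow(t=10.0))
```

Without the ID set, a tricycle that straddles L1 over several consecutive frames would be counted once per frame.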
This embodiment correspondingly also provides a system for tricycle flow statistics and manned judgment, comprising:
the data collection module, which is used for acquiring image data for monitoring a traffic scene;
In this embodiment, video data of traffic scenes at 7 intersections are acquired; after frame extraction, sampling, renaming, and deletion of image frames that do not contain tricycles, 20000 images containing tricycles are collected and recorded as a first data set. The tricycle targets are annotated with rectangular frames using a labeling tool, and each annotated tricycle rectangular frame is converted into the format (class_id, x, y, w, h), where class_id is the category number of the tricycle, x and y are the normalized abscissa and ordinate of the center point, and w and h are the normalized width and height of the rectangular frame.
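The conversion of an annotated frame into the (class_id, x, y, w, h) format can be illustrated as follows, assuming the labeling tool exports pixel corner coordinates. The function name and the example numbers are hypothetical.

```python
from typing import Tuple


def to_normalized_label(
    class_id: int,
    x_tl: float,
    y_tl: float,
    x_br: float,
    y_br: float,
    img_w: int,
    img_h: int,
) -> Tuple[int, float, float, float, float]:
    """Convert pixel corners to (class_id, x, y, w, h) with a normalized center, width, and height."""
    x_center = (x_tl + x_br) / 2.0 / img_w
    y_center = (y_tl + y_br) / 2.0 / img_h
    width = (x_br - x_tl) / img_w
    height = (y_br - y_tl) / img_h
    return class_id, x_center, y_center, width, height


if __name__ == "__main__":
    # A 200 x 200 pixel frame inside a 1000 x 500 image (made-up numbers).
    print(to_normalized_label(0, 100, 50, 300, 250, img_w=1000, img_h=500))
```

Dividing by the image size makes the labels independent of image resolution.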
The tricycle rectangular frame and number acquisition module inputs the image data into a target tracking model to acquire the coordinates of the tricycle rectangular frame in the image and the number of the tricycle;
In this embodiment, the target tracking model of the tricycle is constructed from a target tracking algorithm consisting of a one-stage target detection network, feature matching, detection-frame IOU (intersection over union) matching, and Kalman filtering. A set of convolution layers (3 × 3 convolution, batch normalization layer, activation function layer) is attached to each output level of a YOLO series target detection network (YOLOv3, YOLOv4, YOLOv5, etc.), changing the original output of the network into target box coordinates, target confidence, target class, and target feature vector. Specifically, a YOLO series network mainly consists of a backbone network and several YOLO layers. If the dimension of the extracted tricycle feature is D, the output dimension of the network is (6A + D) × H × W, where A is the number of anchors for each YOLO layer, set here to 4; H and W are the height and width of the image output by the corresponding YOLO layer, respectively; and 6A indicates that each anchor corresponds to 6 parameters: the category id, the confidence, the abscissa and ordinate (x, y) of the upper left corner of the tricycle rectangular frame, and the width w and height h of the rectangular frame.
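As a quick check of the output dimension (6A + D) × H × W, the arithmetic can be written out as below. The function name is hypothetical, and the values D = 128 and H = W = 19 in the example are assumed for illustration only; the source fixes only A = 4 for the tracking network and A = 3 for the human body detection network.

```python
from typing import Tuple


def head_output_shape(
    num_anchors: int,
    height: int,
    width: int,
    feature_dim: int = 0,
    params_per_anchor: int = 6,
) -> Tuple[int, int, int]:
    """Return (channels, H, W) where channels = params_per_anchor * A + D."""
    channels = params_per_anchor * num_anchors + feature_dim
    return channels, height, width


if __name__ == "__main__":
    # Tracking head: A = 4 from the text, D = 128 and a 19 x 19 grid are assumed values.
    print(head_output_shape(4, 19, 19, feature_dim=128))
    # Human body detection head: A = 3, no feature vector (D = 0).
    print(head_output_shape(3, 13, 13))
```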
The track_id (that is, the number distinguishing different tricycles) and the Kalman parameters are initialized from the tricycle rectangular frames that the YOLO series network outputs for the previous frame. Feature distances are calculated between the tricycle rectangular frames output by the YOLO series network for the current frame and the tricycle rectangular frames that carry a track_id from the previous frame, and matching is performed on these distances; each matched tricycle rectangular frame with a track_id from the previous frame is updated with the information of the corresponding tricycle rectangular frame in the current frame. IOU distances are then calculated between the still-unmatched tricycle rectangular frames with a track_id from the previous frame and the still-unmatched tricycle rectangular frames of the current frame, matching is performed again, and the matched frames are updated in the same way. A tricycle rectangular frame with a track_id that remains unmatched is marked as lost, and if a frame in the lost state goes unmatched for more than 30 frames, it is deleted.
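The two-stage association described above (feature matching first, then IOU matching on what remains) can be sketched as follows. Several choices here are assumptions and not taken from the source: cosine distance as the feature distance, greedy lowest-cost pairing in place of whatever assignment algorithm the authors used, and the two gating thresholds. Kalman prediction and the track update step are omitted.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x, y, w, h), (x, y) is the upper-left corner


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two (x, y, w, h) frames."""
    ix = max(0.0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def cosine_distance(u: Sequence[float], v: Sequence[float]) -> float:
    dot = sum(p * q for p, q in zip(u, v))
    norm = math.sqrt(sum(p * p for p in u)) * math.sqrt(sum(q * q for q in v))
    return 1.0 - dot / norm if norm > 0 else 1.0


def greedy_match(track_ids: List[int], det_ids: List[int],
                 cost: Callable[[int, int], float], max_cost: float) -> List[Tuple[int, int]]:
    """Pair tracks and detections in order of increasing cost, skipping pairs above max_cost."""
    candidates = sorted((cost(t, d), t, d) for t in track_ids for d in det_ids)
    matches, used_t, used_d = [], set(), set()
    for c, t, d in candidates:
        if c > max_cost:
            break
        if t in used_t or d in used_d:
            continue
        matches.append((t, d))
        used_t.add(t)
        used_d.add(d)
    return matches


def associate(tracks: Dict[int, dict], dets: List[dict],
              max_feat_dist: float = 0.3, max_iou_dist: float = 0.7):
    """Return (matches, lost_track_ids, new_detection_indices)."""
    det_ids = list(range(len(dets)))
    stage1 = greedy_match(
        list(tracks), det_ids,
        lambda t, d: cosine_distance(tracks[t]["feat"], dets[d]["feat"]), max_feat_dist)
    left_t = [t for t in tracks if t not in {m[0] for m in stage1}]
    left_d = [d for d in det_ids if d not in {m[1] for m in stage1}]
    stage2 = greedy_match(
        left_t, left_d,
        lambda t, d: 1.0 - iou(tracks[t]["box"], dets[d]["box"]), max_iou_dist)
    matches = stage1 + stage2
    lost = [t for t in tracks if t not in {m[0] for m in matches}]
    new = [d for d in det_ids if d not in {m[1] for m in matches}]
    return matches, lost, new
```

In a full tracker, each ID in `lost` would have its unmatched-frame counter incremented and be deleted once the counter exceeds 30, as the text describes, while the indices in `new` would start fresh tracks.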
The tricycle images of the first data set are divided into a training set and a verification set at a ratio of 9:1 for model training and verification. The confidence of each tracked tricycle rectangular frame is compared with a set tricycle detection threshold; when the confidence is greater than the threshold, the frame is considered a valid tricycle rectangular frame.
Through the above operations, the abscissa and ordinate of the upper left corner, the width, and the height of the rectangular frames of the different tricycles are obtained, together with the numbers of the tricycles.
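The 9:1 division into training and verification sets can be sketched as below. Shuffling before the split and the fixed seed are assumptions, since the source does not say how images are assigned to each set, and the file names are made up.

```python
import random
from typing import List, Sequence, Tuple


def split_dataset(items: Sequence[str], train_ratio: float = 0.9,
                  seed: int = 0) -> Tuple[List[str], List[str]]:
    """Shuffle a copy of the items and split it into (training, verification) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_train = int(round(len(shuffled) * train_ratio))
    return shuffled[:n_train], shuffled[n_train:]


if __name__ == "__main__":
    images = [f"tricycle_{i:05d}.jpg" for i in range(20000)]  # hypothetical file names
    train, val = split_dataset(images)
    print(len(train), len(val))
```

With the 20000 images of the first data set this yields 18000 training images and 2000 verification images.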
The tricycle image area acquisition module is used for acquiring an image area containing the tricycle according to the coordinates of a rectangular frame of the tricycle in an image;
In this embodiment, let the coordinates of the tricycle rectangular frame in the image be (x, y, w, h), where x and y are the abscissa and ordinate of the upper left corner and w and h are the width and height of the rectangular frame; let the outward-expansion pixel value be p; and let the width and height of the input image be W and H, respectively. The upper-left corner coordinates $(x_{T\_tl}, y_{T\_tl})$ and the lower-right corner coordinates $(x_{T\_br}, y_{T\_br})$ of the expanded tricycle rectangular frame are obtained as follows:
$x_{T\_tl} = x - p$; if $x_{T\_tl} < 0$, then $x_{T\_tl} = 0$,
$y_{T\_tl} = y - p$; if $y_{T\_tl} < 0$, then $y_{T\_tl} = 0$,
$x_{T\_br} = x + w + p$; if $x_{T\_br} \ge W$, then $x_{T\_br} = W - 1$,
$y_{T\_br} = y + h + p$; if $y_{T\_br} \ge H$, then $y_{T\_br} = H - 1$.
The processed tricycle coordinates are located in the input image to obtain and store the image area of the tricycle. The stored tricycle images are screened, and 4000 tricycle images are collected and recorded as a second data set, which includes images of tricycles carrying people and images of tricycles not carrying people.
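The outward expansion with clamping, followed by the crop, can be sketched as below. The function names are hypothetical. The image is represented as a plain list of pixel rows so that the example needs no external library, and treating the lower-right corner as inclusive when slicing is an assumption.

```python
from typing import List, Tuple


def expand_box(x: int, y: int, w: int, h: int, p: int,
               img_w: int, img_h: int) -> Tuple[int, int, int, int]:
    """Expand an (x, y, w, h) frame by p pixels and clamp it to the image bounds."""
    x_tl = max(x - p, 0)
    y_tl = max(y - p, 0)
    x_br = x + w + p
    if x_br >= img_w:
        x_br = img_w - 1
    y_br = y + h + p
    if y_br >= img_h:
        y_br = img_h - 1
    return x_tl, y_tl, x_br, y_br


def crop(image: List[list], box: Tuple[int, int, int, int]) -> List[list]:
    """Cut the region out of a row-major image; both corners are treated as inclusive."""
    x_tl, y_tl, x_br, y_br = box
    return [row[x_tl:x_br + 1] for row in image[y_tl:y_br + 1]]


if __name__ == "__main__":
    img_w, img_h = 60, 100
    image = [[0] * img_w for _ in range(img_h)]
    box = expand_box(5, 10, 50, 40, p=10, img_w=img_w, img_h=img_h)
    region = crop(image, box)
    print(box, len(region), len(region[0]))
```

The expansion leaves some margin around the tricycle, so a person who is partly outside the tracked frame is less likely to be cut off in the cropped region.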
The human body rectangular frame acquisition module on the tricycle constructs a human body detection model on the tricycle, and the image area containing the tricycle is input into the human body detection model to obtain the coordinates of the human body rectangular frame;
In this embodiment, a labeling tool is used to annotate the human bodies in the tricycle images of the second data set with rectangular frames, and each annotated human body rectangular frame is converted into the format (class_id, x, y, w, h), where class_id is the category number of the human body, x and y are the normalized abscissa and ordinate of the center point, and w and h are the normalized width and height of the rectangular frame. A YOLOv3 network is used as the human body detection network on the tricycle, and the detection results at three different scales are fused to determine the final output. The output dimension of the network is 6A × H × W, where A is the number of anchors per YOLO layer, set here to A = 3; H and W are the height and width of the corresponding output image, respectively; and 6A indicates that each anchor corresponds to 6 parameters: the category id, the confidence, the abscissa and ordinate (x, y) of the upper left corner, and the abscissa and ordinate (x, y) of the lower right corner of the human body rectangular frame.
The tricycle images of the second data set are divided into a training set and a verification set at a ratio of 9:1 for model training and verification. The confidence of each human body rectangular frame is compared with a set human body detection threshold; when the confidence is greater than the threshold, the frame is considered a valid human body rectangular frame.
The horizontal and vertical coordinates of the upper left corner and the lower right corner of the rectangular frame of the human body on the tricycle can be obtained through the operation.
The tricycle manned judging module judges whether the tricycle is carrying people from the positional relationship between the coordinates of the human body rectangular frame and the coordinates of the corresponding tricycle rectangular frame.
In this embodiment, the human body rectangular frame output by the human body detection model is mapped into the original input image through coordinate transformation; let the mapped coordinates be $(x_{P\_tl}, y_{P\_tl}, x_{P\_br}, y_{P\_br})$. The obtained coordinates of the tricycle rectangular frame are converted into upper-left and lower-right corner coordinates; let the converted coordinates be $(x_{T\_tl}, y_{T\_tl}, x_{T\_br}, y_{T\_br})$. Let the number of human bodies on a tricycle be $count_{ID}$, initially 0. If the tricycle with the corresponding ID and a human body rectangular frame on it satisfy $x_{T\_tl} < x_{P\_br} < x_{T\_br}$ and $y_{T\_tl} < y_{P\_br} < y_{T\_br}$, then $count_{ID} = count_{ID} + 1$. If $count_{ID} \ge 2$, the tricycle with that ID number is considered to be carrying people; otherwise it is considered not to be carrying people.
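The source does not spell out the coordinate transformation that maps a human body frame from the cropped tricycle region back into the original input image. One plausible form, shown here purely as an assumption, is a translation by the upper-left corner of the cropped region, with an optional scale factor in case the crop was resized before detection. The function and parameter names are hypothetical.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x_tl, y_tl, x_br, y_br)


def map_to_original(person_box_in_crop: Box,
                    crop_origin: Tuple[float, float],
                    scale: Tuple[float, float] = (1.0, 1.0)) -> Box:
    """Map a frame from crop coordinates to original-image coordinates.

    crop_origin is the upper-left corner of the cropped tricycle region in the
    original image; scale is (crop_width / model_input_width,
    crop_height / model_input_height) and stays (1, 1) when no resizing occurred.
    """
    x_tl, y_tl, x_br, y_br = person_box_in_crop
    origin_x, origin_y = crop_origin
    scale_x, scale_y = scale
    return (
        x_tl * scale_x + origin_x,
        y_tl * scale_y + origin_y,
        x_br * scale_x + origin_x,
        y_br * scale_y + origin_y,
    )


if __name__ == "__main__":
    print(map_to_original((10, 20, 50, 80), crop_origin=(100, 200)))
```

After this mapping, the human body frame and the tricycle frame share one coordinate system, which the positional comparison requires.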
The tricycle flow counting module is provided with a rectangular frame area A perpendicular to the tricycle flow running direction, and counts the tricycle flow by judging whether the tricycle passes through a certain side of the area A perpendicular to the tricycle flow direction.
In this embodiment, the rectangular area A is given by its upper-left and lower-right corner coordinates $(x_{A\_tl}, y_{A\_tl}, x_{A\_br}, y_{A\_br})$, and the upper-left and lower-right corner coordinates of the tricycle rectangular frame are $(x_{T\_tl}, y_{T\_tl}, x_{T\_br}, y_{T\_br})$. Let the tricycle flow be $f_{tricycle}$ and the total number of tricycles passing through area A within a short time $t$ be $N_{tricycle}$, initially 0. If a tricycle satisfies the following relationship with side L1 of area A: $y_{T\_tl} \le y_{A\_tl} < y_{T\_br}$, and the following relationship with sides L2 and L3 of area A: $x_{A\_tl} \le x_{T\_tl} < x_{A\_br}$ or $x_{A\_tl} < x_{T\_br} \le x_{A\_br}$, then $N_{tricycle} = N_{tricycle} + 1$. The tricycle flow through area A is $f_{tricycle} = N_{tricycle} / t$.
The above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A method for flow statistics and manned judgment of a tricycle is characterized by comprising the following steps:
S1: acquiring image data for monitoring a traffic scene;
S2: constructing a target tracking model of the tricycle, inputting the image data into the target tracking model, and obtaining the coordinates of a rectangular frame of the tricycle in the image and the ID number of the tricycle;
S3: acquiring an image area containing the tricycle according to the coordinates of the rectangular frame of the tricycle in the image;
S4: constructing a human body detection model on the tricycle, and inputting the image area containing the tricycle into the human body detection model to obtain the coordinates of a human body rectangular frame;
S5: judging whether the tricycle carries people from the positional relationship between the coordinates of the human body rectangular frame and the coordinates of the corresponding tricycle rectangular frame, specifically comprising: mapping the coordinates of the human body rectangular frame to the original input image; converting the coordinates of the corresponding tricycle rectangular frame into upper-left and lower-right corner coordinates; when the abscissa and ordinate of the lower right corner of the human body rectangular frame lie between the abscissas and ordinates, respectively, of the upper left and lower right corners of the corresponding tricycle rectangular frame, adding one to the number of human bodies on the corresponding tricycle; if the number of human bodies on the corresponding tricycle is greater than or equal to 2, the tricycle is considered to be carrying people; otherwise, the tricycle is considered not to be carrying people;
S6: setting a rectangular frame area A perpendicular to the running direction of the tricycle traffic flow, and counting the tricycle flow by judging whether the tricycle passes through a certain side of area A perpendicular to the traffic flow direction.
2. The method for tricycle flow statistics and manned judgment according to claim 1, wherein constructing the target tracking model of the tricycle and obtaining the coordinates of the rectangular frame of the tricycle in the image and the ID number of the tricycle in step S2 specifically comprises: (1) making a tricycle detection data set: performing frame extraction and sampling on video data acquired from a traffic scene, deleting image frames that do not contain a tricycle, marking the position of each tricycle in the data set with a rectangular frame, and assigning the tricycle category; (2) connecting a group of convolutional layers at each output level of a YOLO series target detection network, so that the final output of the improved network becomes target frame coordinates, target confidence, target category, and target feature vector; (3) training a tricycle detection model with the improved YOLO series network on the tricycle detection data set; (4) obtaining the target tracking model of the tricycle by combining the tricycle detection model, a Kalman filtering algorithm, and a matching algorithm; (5) the coordinates of the rectangular frame of the tricycle in the image obtained by the tricycle detection model being the abscissa and ordinate of the upper left corner, the width, and the height; (6) comparing the confidence of the rectangular frame with a set tricycle detection threshold, considering the rectangular frame a valid tricycle rectangular frame when its confidence is greater than the threshold, distinguishing different tricycles through the tracking model, and assigning a different ID number to each valid tricycle rectangular frame.
3. The method for tricycle flow statistics and manned judgment according to claim 1, wherein acquiring the image area containing the tricycle in step S3 specifically comprises: (1) subtracting a settable outward-expansion pixel value from the abscissa and the ordinate of the upper left corner of the tricycle rectangular frame obtained by the target tracking model to obtain the coordinates of the upper left corner of the tricycle image area; (2) adding the outward-expansion pixel value and the width and height of the tricycle rectangular frame to the abscissa and the ordinate, respectively, of the upper left corner of the tricycle rectangular frame obtained by the target tracking model to obtain the coordinates of the lower right corner of the tricycle image area; (3) cropping the image area of the tricycle from the original input image using the upper-left and lower-right corner coordinates of the tricycle image area, and storing it.
4. The method for tricycle flow statistics and manned judgment according to claim 1, wherein constructing the human body detection model on the tricycle and obtaining the coordinates of the human body rectangular frame in step S4 specifically comprises: (1) making a human body detection data set from the cropped and stored image areas of the tricycle: marking the position of each human body on the tricycle with a rectangular frame and assigning the human body category; (2) training a human body detection model with a deep convolutional neural network on the human body detection data set; (3) the coordinates of the human body rectangular frame obtained by the human body detection model being the abscissa and ordinate of the upper left corner and the abscissa and ordinate of the lower right corner; (4) comparing the confidence of the rectangular frame with a set human body detection threshold, and considering the rectangular frame a valid human body rectangular frame when its confidence is greater than the threshold.
5. The method for tricycle flow statistics and manned judgment according to claim 1, wherein counting the tricycle flow in step S6 by judging whether the tricycle passes through a certain side of area A perpendicular to the traffic flow direction specifically comprises: (1) of the two sides of area A perpendicular to the traffic flow direction, the one close to the tricycle is marked as L1, and the two sides of area A parallel to the traffic flow direction are marked as L2 and L3, respectively; (2) judging whether the ordinate of the upper left corner and the ordinate of the lower right corner of the rectangular frame of a tricycle lie on the two sides of L1, and whether the abscissa of the upper left corner or the abscissa of the lower right corner of the rectangular frame lies between L2 and L3; if so, storing the ID number and coordinate frame information of the tricycle and performing tricycle flow statistics.
6. A system for tricycle flow statistics and manned judgment, characterized by comprising:
the data collection module is used for acquiring image data for monitoring the traffic scene;
the tricycle rectangular frame and number acquisition module inputs the image data into a target tracking model to acquire the coordinates of the tricycle rectangular frame in the image and the number of the tricycle;
the tricycle image area acquisition module is used for acquiring an image area containing the tricycle according to the coordinates of a rectangular frame of the tricycle in an image;
the human body rectangular frame acquisition module on the tricycle constructs a human body detection model on the tricycle, and the image area containing the tricycle is input into the human body detection model to obtain the coordinates of the human body rectangular frame;
the tricycle manned judging module judges whether the tricycle is manned or not by judging the position relation between the coordinates of the human body rectangular frame and the coordinates of the corresponding tricycle rectangular frame;
the tricycle flow counting module is provided with a rectangular frame area A perpendicular to the tricycle flow running direction, and counts the tricycle flow by judging whether the tricycle passes through a certain side of the area A perpendicular to the tricycle flow direction.
7. The system for tricycle flow statistics and manned judgment according to claim 6, wherein the tricycle rectangular frame and number acquisition module specifically comprises: (1) making a tricycle detection data set: performing frame extraction and sampling on video data acquired from a traffic scene, deleting image frames that do not contain a tricycle, marking the position of each tricycle in the data set with a rectangular frame, and assigning the tricycle category; (2) connecting a group of convolutional layers at each output level of a YOLO series target detection network, so that the final output of the improved network becomes target frame coordinates, target confidence, target category, and target feature vector; (3) training a tricycle detection model with the improved YOLO series network on the tricycle detection data set; (4) obtaining the target tracking model of the tricycle by combining the tricycle detection model, a Kalman filtering algorithm, and a matching algorithm; (5) the coordinates of the rectangular frame of the tricycle in the image obtained by the tricycle detection model being the abscissa and ordinate of the upper left corner, the width, and the height; (6) comparing the confidence of the rectangular frame with a set tricycle detection threshold, considering the rectangular frame a valid tricycle rectangular frame when its confidence is greater than the threshold, distinguishing different tricycles through the tracking model, and assigning a different ID number to each valid tricycle rectangular frame.
8. The system for tricycle flow statistics and manned judgment according to claim 6, wherein the tricycle image area acquisition module specifically comprises: (1) subtracting a settable outward-expansion pixel value from the abscissa and the ordinate of the upper left corner of the tricycle rectangular frame obtained by the target tracking model to obtain the coordinates of the upper left corner of the tricycle image area; (2) adding the outward-expansion pixel value and the width and height of the tricycle rectangular frame to the abscissa and the ordinate, respectively, of the upper left corner of the tricycle rectangular frame obtained by the target tracking model to obtain the coordinates of the lower right corner of the tricycle image area; (3) cropping the image area of the tricycle from the original input image using the upper-left and lower-right corner coordinates of the tricycle image area, and storing it.
9. The system for tricycle flow statistics and manned judgment according to claim 6, wherein the tricycle manned judging module specifically comprises: mapping the coordinates of the human body rectangular frame to the original input image; converting the coordinates of the corresponding tricycle rectangular frame into upper-left and lower-right corner coordinates; when the abscissa and ordinate of the lower right corner of the human body rectangular frame lie between the abscissas and ordinates, respectively, of the upper left and lower right corners of the corresponding tricycle rectangular frame, adding one to the number of human bodies on the corresponding tricycle; if the number of human bodies on the corresponding tricycle is greater than or equal to 2, the tricycle is considered to be carrying people; otherwise, the tricycle is considered not to be carrying people.
10. The system for tricycle flow statistics and manned judgment according to claim 6, wherein the tricycle flow counting module specifically comprises: (1) of the two sides of area A perpendicular to the traffic flow direction, the one close to the tricycle is marked as L1, and the two sides of area A parallel to the traffic flow direction are marked as L2 and L3, respectively; (2) judging whether the ordinate of the upper left corner and the ordinate of the lower right corner of the rectangular frame of a tricycle lie on the two sides of L1, and whether the abscissa of the upper left corner or the abscissa of the lower right corner of the rectangular frame lies between L2 and L3; if so, storing the ID number and coordinate frame information of the tricycle and performing tricycle flow statistics.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210177738.XA CN114550094A (en) 2022-02-24 2022-02-24 Method and system for flow statistics and manned judgment of tricycle

Publications (1)

Publication Number Publication Date
CN114550094A true CN114550094A (en) 2022-05-27

Family

ID=81679296

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210177738.XA Pending CN114550094A (en) 2022-02-24 2022-02-24 Method and system for flow statistics and manned judgment of tricycle

Country Status (1)

Country Link
CN (1) CN114550094A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115410155A (en) * 2022-08-31 2022-11-29 珠海数字动力科技股份有限公司 Pedestrian flow statistical method based on multi-target tracking

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination