CN110610159A - Real-time bus passenger flow volume statistical method

Real-time bus passenger flow volume statistical method

Info

Publication number
CN110610159A
CN110610159A
Authority
CN
China
Prior art keywords
feature
feature mapping
convolution
passenger flow
real
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910869554.8A
Other languages
Chinese (zh)
Inventor
靳展
章国泰
钟明旸
王红广
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin Card Intelligent Network Polytron Technologies Inc
Original Assignee
Tianjin Card Intelligent Network Polytron Technologies Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin Card Intelligent Network Polytron Technologies Inc filed Critical Tianjin Card Intelligent Network Polytron Technologies Inc
Priority to CN201910869554.8A
Publication of CN110610159A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/25Determination of region of interest [ROI] or a volume of interest [VOI]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/52Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V20/53Recognition of crowd images, e.g. recognition of crowd congestion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a real-time bus passenger flow volume statistical method that detects and tracks human heads in video images. The video data are passenger flow videos captured by cameras mounted above the front and rear doors of a bus, and the statistical method comprises the following steps: front-end data acquisition, model training, head tracking and passenger flow counting. Compared with the prior art, the invention has the advantage that the head is detected with a deep learning method, the method runs in real time, and the passenger flow can be counted efficiently and accurately.

Description

Real-time bus passenger flow volume statistical method
Technical field:
The invention relates to the technical field of image processing in pattern recognition, and in particular to a real-time bus passenger flow volume statistical method.
Background art:
Infrared devices and pressure sensors are earlier techniques for passenger flow statistics, but their detection error is large, so they have gradually been abandoned. In recent years, with the continuous development of deep learning and GPU parallel computing, computer vision has advanced rapidly. Passenger flow statistics is an important application in the field of image processing, and is also a new field and direction in current intelligent video surveillance.
In recent years, passenger flow statistics methods have fallen mainly into three categories: detection based on feature points, detection based on human body segmentation and tracking, and detection based on deep learning. The accuracy of the methods based on feature points and on human body segmentation and tracking still needs to be improved. As hardware conditions continue to improve and spread, detection methods based on deep learning are receiving more and more attention.
Summary of the invention:
the invention aims to provide a method for detecting the human head in a video image and realize end-to-end training. Such a method should be able to detect the passenger flow efficiently and accurately. The specific technical scheme is as follows:
the video data of the method is passenger flow video data shot from the top areas of the front door and the rear door of the bus by using a camera, and the statistical method comprises the following steps:
step 1: front-end data acquisition:
step 1.1: video acquisition: vertically installing a camera right above a bus door, and acquiring an image video of passengers getting on and off the bus;
step 1.2: video framing, namely framing the video and dividing the video into 640 × 480 RGB three-channel images;
step 1.3: image zooming: scaling each frame of image into 224 × 224 data;
step 2: training a model;
step 2.1: feature calculation: extracting the features of the image with a 3 × 3 convolution kernel; then, to reduce the feature computation time, the ordinary convolution operation is replaced by a depthwise separable convolution operation, a series of mathematical transformations are applied to the feature mapping with Batch Normalization and the nonlinear activation function ReLU, and the spatial size of the feature mapping is repeatedly halved with max pooling; the feature mapping is then convolved with a 1 × 1 convolution of stride 1, and Batch Normalization and ReLU are applied again to set the number of channels of the feature mapping;
step 2.2: feature extraction: because the target sizes are different, feature mappings of different dimensions need to be extracted;
step 2.3: extracting anchor frames: on feature mappings of different dimensions, selecting four frames of different scales centered at each feature point, and taking all of these frames as candidate frames for target classification and regression;
step 2.4: target classification: convolving the feature mapping with the depthwise separable convolution, obtaining the prediction score of each candidate box obtained in step 2.3 for every category, comparing the scores with the ground truth values, and computing the cross entropy between them to obtain the classification loss; the network parameters are optimized with stochastic gradient descent until the classification loss reaches the specified range;
step 2.5: target regression: convolving the feature mapping with the depthwise separable convolution, obtaining the predicted center point, width and height of each candidate frame obtained in step 2.3, comparing the predictions with the ground truth values, and computing the linear regression loss between them; the network parameters are optimized with stochastic gradient descent until the linear regression loss reaches the specified range;
step 3: head detection;
step 3.1: feature calculation: extracting the features of the image with a 3 × 3 convolution kernel; then, to reduce the feature computation time, the ordinary convolution operation is replaced by a depthwise separable convolution operation, the feature mapping is transformed with Batch Normalization and the nonlinear activation function ReLU, and the spatial size of the feature mapping is repeatedly halved with max pooling; the feature mapping is then convolved with a 1 × 1 convolution of stride 1, and Batch Normalization and ReLU are applied again to set the number of channels of the feature mapping;
step 3.2: feature extraction: because the target sizes are different, feature mappings of different dimensions need to be extracted;
step 3.3: extracting anchor frames: on feature mappings of different dimensions, selecting four frames of different scales centered at each feature point, and taking all of these frames as candidate frames for target classification and regression;
step 3.4: target classification: convolving the feature mapping with the depthwise separable convolution, obtaining the prediction score of each candidate frame obtained in step 2.3 for every category, and screening the candidate frames with Intersection over Union (IoU) and non-maximum suppression (NMS);
step 3.5: target regression: convolving the feature mapping with the depthwise separable convolution, obtaining the predicted center point, width and height of each candidate frame obtained in step 2.3, and screening the candidate frames with Intersection over Union (IoU) and non-maximum suppression (NMS);
step 4: head tracking: a kernelized correlation filter tracks each detection window and forms a trajectory; if the trajectory crosses the specified boundary, the passenger has completed the boarding or alighting action;
step 5: passenger flow counting: if a passenger has completed the boarding or alighting action, the algorithm increases the passenger flow count by 1; otherwise the count remains unchanged;
As one of the preferable schemes, the specific process of step 2.1 is as follows: first, a convolution with a 3 × 3 kernel and stride 1 is applied to the RGB image, the feature mapping is then transformed with Batch Normalization and the nonlinear activation function ReLU, and the feature mapping is down-sampled with max pooling; next, the feature mapping is convolved with a 3 × 3 depthwise separable convolution of stride 1, transformed with Batch Normalization and ReLU, and down-sampled again with max pooling; finally, the feature mapping is convolved with a 1 × 1 convolution of stride 1, and Batch Normalization and ReLU are applied to set the number of channels of the feature mapping.
As a second preferred solution, the selection method of the feature mapping in step 2.2 is as follows: all the different feature mappings need to be selected due to the different sizes of the targets; for small targets, a larger feature mapping needs to be selected for classification and regression; for large targets, a smaller feature map needs to be selected for classification and regression.
As a third preferred scheme, in step 2.3, the candidate frame selection method comprises: selecting four frames with different scales on feature mapping with different dimensions by taking each feature point as a center; this allows the entire area of the image to be covered to avoid missing portions.
As a fourth preferred solution, in step 2.4, the cross entropy loss is calculated as follows:
Lloc = −(1/n) · Σ_{i=1..n} [ y_i·log(ŷ_i) + (1 − y_i)·log(1 − ŷ_i) ]
where y_i denotes the ground truth value, ŷ_i denotes the predicted value, and n is the number of candidate frames.
As a fifth preferred embodiment, the linear regression loss in step 2.5 is calculated as follows:
Lreg = (α/n) · Σ_{i=1..n} [ (x_i − x̂_i)² + (y_i − ŷ_i)² + (w_i − ŵ_i)² + (h_i − ĥ_i)² ]
where x_i, y_i, w_i, h_i denote the ground truth values, x̂_i, ŷ_i, ŵ_i, ĥ_i denote the predicted values, α is a coefficient that must be set to a suitable value for the specific scene, and n is the number of candidate frames.
As a further preferable embodiment of the fifth preferable embodiment, the overall loss L is further obtained in the step 2.5, and the calculation manner of L is as follows:
L=Lloc+βLreg
wherein β is a coefficient;
and optimizing the parameters of the network by using a random gradient descent method until the total loss reaches a specified range.
As a sixth preferred embodiment, the method further comprises the step 6: storing the coordinates and width and height of the passenger head detected by each frame of image into a file of a detection result, and storing each frame of image; when the bus stops running, the method can output the final passenger flow.
Compared with the prior art, the invention has the advantages that:
the detection method has real-time performance and can efficiently and accurately detect the passenger flow.
Depth separable convolution (II) can speed up the time of feature extraction.
And (III) in the model training process, end-to-end training can be realized, and all training can be completed on the GPU.
And (IV) maximum value sampling maxporoling is used for replacing the step size to be 2, so that the loss of detail information can be avoided, and the smoothness of the target is improved.
Description of the drawings:
fig. 1 is a flow chart of a bus passenger flow volume statistical method in the embodiment of the invention.
Fig. 2 is a schematic flow chart of a convolution module in the embodiment of the present invention.
FIG. 3 is a flow chart illustrating a depth separable convolution according to an embodiment of the present invention.
Detailed description of the embodiments:
Embodiment:
A real-time bus passenger flow volume statistical method, in which the video data are passenger flow videos captured by cameras mounted above the front and rear doors of the bus; the statistical method comprises the following steps:
step 1: front-end data acquisition:
step 1.1: video acquisition: vertically installing a camera right above a bus door, and acquiring an image video of passengers getting on and off the bus;
step 1.2: video framing, namely framing the video and dividing the video into 640 × 480 RGB three-channel images;
step 1.3: image zooming: scaling each frame of image into 224 × 224 data;
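To illustrate steps 1.2 and 1.3, a minimal Python/OpenCV sketch of the framing and scaling stage is given below; the function name, the generator structure and the handling of the video file path are illustrative assumptions rather than part of the original filing:

```python
import cv2

def door_camera_frames(video_path, capture_size=(640, 480), model_size=(224, 224)):
    """Split a door-camera video into frames: 640x480 RGB capture size, 224x224 network input."""
    cap = cv2.VideoCapture(video_path)
    while True:
        ok, frame_bgr = cap.read()                               # one BGR frame per iteration
        if not ok:
            break
        frame_bgr = cv2.resize(frame_bgr, capture_size)          # enforce 640x480
        frame_rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)   # three-channel RGB
        yield cv2.resize(frame_rgb, model_size)                  # 224x224 model input
    cap.release()
```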
step 2: completing model training on a CPU or a GPU;
step 2.1: feature calculation: first, a convolution with a 3 × 3 kernel and stride 1 is applied to the RGB image, the feature mapping is then transformed with Batch Normalization and the nonlinear activation function ReLU, and the feature mapping is down-sampled with max pooling; next, the feature mapping is convolved with a 3 × 3 depthwise separable convolution of stride 1, transformed with Batch Normalization and ReLU, and down-sampled again with max pooling; finally, the feature mapping is convolved with a 1 × 1 convolution of stride 1, and Batch Normalization and ReLU are applied to set the number of channels of the feature mapping;
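A minimal PyTorch sketch of the feature-computation block of step 2.1 (3 × 3 convolution, Batch Normalization and ReLU, max pooling, a depthwise separable 3 × 3 convolution, and a final 1 × 1 convolution that sets the channel count) is shown below; the channel widths are illustrative assumptions, not values fixed by the filing:

```python
import torch.nn as nn

class FeatureBlock(nn.Module):
    """3x3 conv -> BN/ReLU -> maxpool -> depthwise separable 3x3 conv -> BN/ReLU -> maxpool -> 1x1 conv."""
    def __init__(self, in_ch=3, mid_ch=32, out_ch=64):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(in_ch, mid_ch, kernel_size=3, stride=1, padding=1),
            nn.BatchNorm2d(mid_ch), nn.ReLU(inplace=True),
            nn.MaxPool2d(2),                       # halve the spatial size instead of using stride 2
        )
        self.depthwise_separable = nn.Sequential(
            nn.Conv2d(mid_ch, mid_ch, kernel_size=3, stride=1, padding=1, groups=mid_ch),  # depthwise
            nn.Conv2d(mid_ch, mid_ch, kernel_size=1),                                      # pointwise
            nn.BatchNorm2d(mid_ch), nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
        )
        self.channel_proj = nn.Sequential(
            nn.Conv2d(mid_ch, out_ch, kernel_size=1, stride=1),   # 1x1 conv sets the channel count
            nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
        )

    def forward(self, x):
        # e.g. a (1, 3, 224, 224) input yields a (1, 64, 56, 56) feature mapping
        return self.channel_proj(self.depthwise_separable(self.stem(x)))
```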
step 2.2: feature extraction: because the target sizes are different, feature mappings of different dimensions need to be extracted; for small targets, a larger feature mapping needs to be selected for classification and regression; for a large target, a smaller feature mapping needs to be selected for classification and regression;
step 2.3: extracting an anchor frame: selecting four frames with different scales on feature mapping with different dimensions by taking each feature point as a center; thus, all areas of the image can be completely covered to avoid missed detection;
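The anchor extraction of step 2.3 can be sketched as follows, placing four differently scaled square boxes at every feature-map location; the stride and the four scales are illustrative assumptions:

```python
import numpy as np

def make_anchors(feat_h, feat_w, stride, scales=(16, 24, 32, 48)):
    """Return (feat_h * feat_w * len(scales), 4) anchor boxes as (cx, cy, w, h) in input-image pixels."""
    anchors = []
    for row in range(feat_h):
        for col in range(feat_w):
            cx, cy = (col + 0.5) * stride, (row + 0.5) * stride   # centre of this feature point
            for s in scales:
                anchors.append((cx, cy, s, s))                    # four scales per location
    return np.array(anchors, dtype=np.float32)

# e.g. a 28x28 feature mapping from a 224x224 input (stride 8) yields 28*28*4 = 3136 candidate boxes
```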
step 2.4: target classification: convolving the feature mapping with the depthwise separable convolution, obtaining the prediction score of each candidate box obtained in step 2.3 for every category, comparing the scores with the ground truth values, and computing the cross entropy between them to obtain the classification loss; the network parameters are optimized with stochastic gradient descent until the classification loss reaches the specified range; the cross entropy loss is calculated as follows:
Lloc = −(1/n) · Σ_{i=1..n} [ y_i·log(ŷ_i) + (1 − y_i)·log(1 − ŷ_i) ]
where y_i denotes the ground truth value, ŷ_i denotes the predicted value, and n is the number of candidate boxes;
step 2.5: target regression: convolving the feature mapping with the depthwise separable convolution, obtaining the predicted center point, width and height of each candidate frame obtained in step 2.3, comparing the predictions with the ground truth values, and computing the linear regression loss between them; the linear regression loss is calculated as follows:
Lreg = (α/n) · Σ_{i=1..n} [ (x_i − x̂_i)² + (y_i − ŷ_i)² + (w_i − ŵ_i)² + (h_i − ĥ_i)² ]
where x_i, y_i, w_i, h_i denote the ground truth values, x̂_i, ŷ_i, ŵ_i, ĥ_i denote the predicted values, α is a coefficient (α = 0.5 in this scene; other scenes should choose a suitable value according to the specific scene), and n is the number of candidate frames;
the overall loss function of the model is calculated as follows:
L=Lloc+βLreg
wherein β is a coefficient, β is 0.25 in the present scenario, and other scenarios need to select a suitable value according to a specific scenario;
optimizing the parameters of the network with stochastic gradient descent until the total loss reaches the specified range;
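A sketch of the training objective and the stochastic-gradient-descent update of steps 2.4–2.5 is given below, assuming a binary cross-entropy classification loss and a squared-error box regression loss with the coefficients α and β of this embodiment; the tensor shapes, the assumption that the classification scores are already sigmoid probabilities, and the network object are illustrative assumptions:

```python
import torch
import torch.nn.functional as F

def total_loss(cls_scores, cls_targets, box_preds, box_targets, alpha=0.5, beta=0.25):
    """L = Lloc + beta * Lreg, with cross-entropy classification and squared-error box regression."""
    # cls_scores: (n,) sigmoid probabilities; cls_targets: (n,) 0/1 labels for the candidate boxes
    l_loc = F.binary_cross_entropy(cls_scores, cls_targets)             # mean over n candidate boxes
    # box_preds / box_targets: (n, 4) tensors holding (cx, cy, w, h) per candidate box
    l_reg = alpha * ((box_preds - box_targets) ** 2).sum(dim=1).mean()
    return l_loc + beta * l_reg

# One SGD step over a mini-batch of candidate boxes (model and data pipeline assumed to exist):
# optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
# loss = total_loss(pred_scores, gt_labels, pred_boxes, gt_boxes)
# optimizer.zero_grad(); loss.backward(); optimizer.step()
```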
step 3: head detection, completed on a CPU or GPU;
step 3.1: feature calculation: extracting the features of the image with a 3 × 3 convolution kernel; then, to reduce the feature computation time, the ordinary convolution operation is replaced by a depthwise separable convolution operation, the feature mapping is transformed with Batch Normalization and the nonlinear activation function ReLU, and the spatial size of the feature mapping is repeatedly halved with max pooling; the feature mapping is then convolved with a 1 × 1 convolution of stride 1, and Batch Normalization and ReLU are applied again to set the number of channels of the feature mapping;
step 3.2: feature extraction: because the target sizes are different, feature mappings of different dimensions need to be extracted;
step 3.3: extracting anchor frames: on feature mappings of different dimensions, selecting four frames of different scales centered at each feature point, and taking all of these frames as candidate frames for target classification and regression;
step 3.4: target classification: convolving the feature mapping with the depthwise separable convolution, obtaining the prediction score of each candidate frame obtained in step 2.3 for every category, and screening the candidate frames with Intersection over Union (IoU) and non-maximum suppression (NMS);
step 3.5: target regression: convolving the feature mapping with the depthwise separable convolution, obtaining the predicted center point, width and height of each candidate frame obtained in step 2.3, and screening the candidate frames with Intersection over Union (IoU) and non-maximum suppression (NMS);
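The screening of candidate boxes in steps 3.4–3.5 relies on Intersection over Union and non-maximum suppression; a minimal NumPy sketch is given below, where the score threshold and IoU threshold are illustrative assumptions:

```python
import numpy as np

def iou(box, boxes):
    """IoU between one (x1, y1, x2, y2) box and an array of boxes."""
    x1 = np.maximum(box[0], boxes[:, 0]); y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2]); y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = lambda b: (b[..., 2] - b[..., 0]) * (b[..., 3] - b[..., 1])
    return inter / (area(box) + area(boxes) - inter)

def nms(boxes, scores, score_thr=0.5, iou_thr=0.45):
    """Keep high-scoring boxes and drop any box that overlaps an already kept box too much."""
    order = np.argsort(-scores)                    # indices sorted by descending score
    order = order[scores[order] > score_thr]       # discard low-confidence candidates
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(best)
        rest = order[1:]
        order = rest[iou(boxes[best], boxes[rest]) < iou_thr]
    return boxes[keep]
```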
step 4: head tracking: a kernelized correlation filter is trained from the information of consecutive frames and correlated with the newly input frame; the resulting confidence map is the predicted tracking result, and the point or block with the highest score is taken as the tracking position;
step 5: passenger flow counting: if a passenger has completed the boarding or alighting action, the algorithm increases the passenger flow count by 1; otherwise the count remains unchanged;
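The head-tracking and counting logic of steps 4 and 5 can be sketched with OpenCV's built-in KCF (kernelized correlation filter) tracker and a simple line-crossing rule; the counting-line position, the tracker API from opencv-contrib, and the bookkeeping structure are illustrative assumptions:

```python
import cv2

COUNT_LINE_Y = 240          # assumed horizontal counting line in the 640x480 door view

class HeadTrack:
    """One tracked head: a KCF tracker plus the trajectory of its centre points."""
    def __init__(self, frame, box):
        self.tracker = cv2.TrackerKCF_create()     # opencv-contrib KCF tracker
        self.tracker.init(frame, tuple(box))       # box = (x, y, w, h) from the head detector
        self.centers = [(box[0] + box[2] / 2, box[1] + box[3] / 2)]

    def update(self, frame):
        ok, (x, y, w, h) = self.tracker.update(frame)
        if ok:
            self.centers.append((x + w / 2, y + h / 2))
        return ok

    def crossed_line(self):
        """True once the trajectory has passed the counting line (boarding/alighting finished)."""
        ys = [c[1] for c in self.centers]
        return min(ys) < COUNT_LINE_Y < max(ys)

def update_count(tracks, passenger_count):
    """Step 5: add 1 for every track whose trajectory has crossed the line."""
    finished = [t for t in tracks if t.crossed_line()]
    passenger_count += len(finished)
    remaining = [t for t in tracks if t not in finished]
    return remaining, passenger_count
```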
Step 6: storing the coordinates and width and height of the passenger head detected by each frame of image into a file of a detection result, and storing each frame of image; when the bus stops running, the method can output the final passenger flow.
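Step 6 stores per-frame detection results and the frame images; a small sketch of that bookkeeping is shown below, where the file layout and naming are illustrative assumptions:

```python
import os
import cv2

def save_frame_result(frame_idx, frame_bgr, head_boxes, result_file, image_dir="frames"):
    """Append the detected head boxes (x, y, w, h) of one frame to a text file and save the frame image."""
    os.makedirs(image_dir, exist_ok=True)
    with open(result_file, "a") as f:
        for x, y, w, h in head_boxes:
            f.write(f"{frame_idx} {x} {y} {w} {h}\n")
    cv2.imwrite(os.path.join(image_dir, f"frame_{frame_idx:06d}.jpg"), frame_bgr)
```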

Claims (10)

1. A real-time bus passenger flow rate statistical method is characterized by comprising the following steps:
step 1: front-end data acquisition:
step 1.1: video acquisition: vertically installing a camera right above a bus door, and acquiring an image video of passengers getting on and off the bus;
step 1.2: video framing: framing the video and dividing it into 640 × 480 RGB three-channel images;
step 1.3: image zooming: scaling each frame of image into 224 × 224 data;
step 2: model training:
step 2.1: feature calculation: extracting the features of the image with a 3 × 3 convolution kernel; then, to reduce the feature computation time, the ordinary convolution operation is replaced by a depthwise separable convolution operation, a series of mathematical transformations are applied to the feature mapping with a normalization function and a nonlinear activation function, and the spatial size of the feature mapping is repeatedly halved with the max-value down-sampling method; the feature mapping is then convolved with a 1 × 1 convolution of stride 1, and normalization and the nonlinear activation function are applied to set the number of channels of the feature mapping;
step 2.2: feature extraction: because the target sizes are different, feature mappings of different dimensions need to be extracted;
step 2.3: extracting an anchor frame: on feature mapping with different dimensions, taking each feature point as a center, selecting four frames with different scales, and taking all the frames as candidate frames for target classification and regression;
step 2.4: target classification: convolving the feature mapping with the depthwise separable convolution, obtaining the prediction score of each candidate box obtained in step 2.3 for every category, comparing the scores with the ground truth values, and computing the cross entropy between them to obtain the classification loss; the network parameters are optimized with stochastic gradient descent until the classification loss reaches the specified range;
step 2.5: target regression: convolving the feature mapping with the depthwise separable convolution, obtaining the predicted center point, width and height of each candidate frame obtained in step 2.3, comparing the predictions with the ground truth values, and computing the linear regression loss between them; the network parameters are optimized with stochastic gradient descent until the linear regression loss reaches the specified range;
and step 3: head detection:
step 3.1: feature calculation: extracting the features of the image with a 3 × 3 convolution kernel; then, to reduce the feature computation time, the ordinary convolution operation is replaced by a depthwise separable convolution operation, a series of mathematical transformations are applied to the feature mapping with a normalization function and a nonlinear activation function, and the spatial size of the feature mapping is repeatedly halved with the max-value down-sampling method; the feature mapping is then convolved with a 1 × 1 convolution of stride 1, and normalization and the nonlinear activation function are applied to set the number of channels of the feature mapping;
step 3.2: feature extraction: because the target sizes are different, feature mappings of different dimensions need to be extracted;
step 3.3: extracting anchor frames: on feature mappings of different dimensions, selecting four frames of different scales centered at each feature point, and taking all of these frames as candidate frames for target classification and regression;
step 3.4: target classification: convolving the feature mapping with the depthwise separable convolution, obtaining the prediction score of each candidate frame obtained in step 2.3 for every category, and screening the candidate frames with non-maximum suppression;
step 3.5: target regression: convolving the feature mapping with the depthwise separable convolution, obtaining the predicted center point, width and height of each candidate frame obtained in step 2.3, and screening the candidate frames with non-maximum suppression;
step 4: head tracking:
tracking the detection window with a kernelized correlation filter method and forming a trajectory; if the trajectory crosses a specified boundary, it indicates that the passenger has completed the boarding or alighting action;
step 5: passenger flow counting:
if the passenger has finished getting on or off, the algorithm increases the passenger flow count by 1; otherwise, the passenger flow count remains unchanged.
2. The method for real-time bus passenger flow statistics according to claim 1, wherein the specific process of step 2.1 is as follows: first, a convolution with a 3 × 3 kernel and stride 1 is applied to the RGB image, the feature mapping is then transformed with the normalization and nonlinear activation functions and down-sampled with the max-value down-sampling method; next, the feature mapping is convolved with a 3 × 3 depthwise separable convolution of stride 1, transformed with the normalization and nonlinear activation functions, and down-sampled again with the max-value down-sampling method; finally, the feature mapping is convolved with a 1 × 1 convolution of stride 1, and the normalization and nonlinear activation functions are applied to set the number of channels of the feature mapping.
3. The method for real-time statistics of the passenger flow volume of the bus according to claim 1, wherein the selection method of the feature mapping in the step 2.2 is as follows: all the different feature mappings need to be selected due to the different sizes of the targets; for small targets, a larger feature mapping needs to be selected for classification and regression; for large targets, a smaller feature map needs to be selected for classification and regression.
4. The method for real-time bus passenger flow statistics according to claim 1, wherein the selection method of the candidate frame in the step 2.3 comprises the following steps: on feature maps of different dimensions; selecting four frames with different scales by taking each feature point as a center; this allows the entire area of the image to be covered to avoid missing portions.
5. The method for real-time bus passenger flow statistics according to claim 1, characterized in that the cross entropy loss in step 2.4 is calculated as follows:
Lloc = −(1/n) · Σ_{i=1..n} [ y_i·log(ŷ_i) + (1 − y_i)·log(1 − ŷ_i) ]
where y_i denotes the ground truth value, ŷ_i denotes the predicted value, and n is the number of candidate frames.
6. The method for real-time statistics of bus passenger flow according to claim 1, characterized in that the linear regression loss in step 2.5 is calculated as follows:
Lreg = (α/n) · Σ_{i=1..n} [ (x_i − x̂_i)² + (y_i − ŷ_i)² + (w_i − ŵ_i)² + (h_i − ĥ_i)² ]
where x_i, y_i, w_i, h_i denote the ground truth values, x̂_i, ŷ_i, ŵ_i, ĥ_i denote the predicted values, α is a coefficient, and n is the number of candidate frames.
7. The method according to claim 6, wherein the step 2.5 further obtains the total loss L, and the calculation method of L is as follows:
L=Lloc+βLreg
wherein β is a coefficient;
and optimizing the parameters of the network with stochastic gradient descent until the total loss reaches a specified range.
8. The method for real-time bus passenger flow statistics according to any one of claims 1-7, characterized in that in step 2 and step 3, the end-to-end operation is completed on the CPU or GPU.
9. The method for real-time statistics of bus passenger flow according to any one of claims 1-7, wherein in step 4, a correlation kernel filter is trained according to the information of the previous and next frames, and correlation calculation is performed with the newly inputted frame, and the obtained confidence map is the predicted tracking result; the point or block with the highest score is the tracking result.
10. The method for real-time statistics of bus passenger flow according to any one of claims 1-7, characterized by further comprising:
step 6: storing the coordinates and width and height of the passenger head detected by each frame of image into a file of a detection result, and storing each frame of image; when the bus stops running, the method can output the final passenger flow.
CN201910869554.8A 2019-09-16 2019-09-16 Real-time bus passenger flow volume statistical method Pending CN110610159A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910869554.8A CN110610159A (en) 2019-09-16 2019-09-16 Real-time bus passenger flow volume statistical method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910869554.8A CN110610159A (en) 2019-09-16 2019-09-16 Real-time bus passenger flow volume statistical method

Publications (1)

Publication Number Publication Date
CN110610159A (en) 2019-12-24

Family

ID=68892777

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910869554.8A Pending CN110610159A (en) 2019-09-16 2019-09-16 Real-time bus passenger flow volume statistical method

Country Status (1)

Country Link
CN (1) CN110610159A (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2704060A2 (en) * 2012-09-03 2014-03-05 Vision Semantics Limited Crowd density estimation
CN105163121A (en) * 2015-08-24 2015-12-16 西安电子科技大学 Large-compression-ratio satellite remote sensing image compression method based on deep self-encoding network
CN107609512A (en) * 2017-09-12 2018-01-19 上海敏识网络科技有限公司 A video face capture method based on a neural network
CN109285376A (en) * 2018-08-09 2019-01-29 同济大学 A kind of bus passenger flow statistical analysis system based on deep learning
CN109117794A (en) * 2018-08-16 2019-01-01 广东工业大学 A kind of moving target behavior tracking method, apparatus, equipment and readable storage medium storing program for executing
CN110147807A (en) * 2019-01-04 2019-08-20 上海海事大学 A kind of ship intelligent recognition tracking

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
陈响: "Research on Key Frame Extraction Algorithms for Bus Scene Video", China Master's Theses Full-text Database, Information Science and Technology Series *
黄俊洁 et al.: "Vehicle Object Detection Based on Fusion of Global and Local Convolutional Features", Journal of Southwest University of Science and Technology *

Similar Documents

Publication Publication Date Title
CN108229338B (en) Video behavior identification method based on deep convolution characteristics
CN108053419B (en) Multi-scale target tracking method based on background suppression and foreground anti-interference
CN108550161B (en) Scale self-adaptive kernel-dependent filtering rapid target tracking method
CN110033473B (en) Moving target tracking method based on template matching and depth classification network
CN109583483B (en) Target detection method and system based on convolutional neural network
CN110120064B (en) Depth-related target tracking algorithm based on mutual reinforcement and multi-attention mechanism learning
CN112257569B (en) Target detection and identification method based on real-time video stream
CN105160310A (en) 3D (three-dimensional) convolutional neural network based human body behavior recognition method
CN111401293B (en) Gesture recognition method based on Head lightweight Mask scanning R-CNN
CN109087337B (en) Long-time target tracking method and system based on hierarchical convolution characteristics
CN111738344A (en) Rapid target detection method based on multi-scale fusion
CN113822352B (en) Infrared dim target detection method based on multi-feature fusion
CN106529441B (en) Depth motion figure Human bodys' response method based on smeared out boundary fragment
CN114463677A (en) Safety helmet wearing detection method based on global attention
CN113850136A (en) Yolov5 and BCNN-based vehicle orientation identification method and system
CN111414938B (en) Target detection method for bubbles in plate heat exchanger
CN112926552A (en) Remote sensing image vehicle target recognition model and method based on deep neural network
CN113763427A (en) Multi-target tracking method based on coarse-fine shielding processing
CN107247967B (en) Vehicle window annual inspection mark detection method based on R-CNN
CN112396036A (en) Method for re-identifying blocked pedestrians by combining space transformation network and multi-scale feature extraction
CN113963333B (en) Traffic sign board detection method based on improved YOLOF model
CN110008834B (en) Steering wheel intervention detection and statistics method based on vision
CN113256683B (en) Target tracking method and related equipment
CN107679467B (en) Pedestrian re-identification algorithm implementation method based on HSV and SDALF
CN112767450A (en) Multi-loss learning-based related filtering target tracking method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
AD01 Patent right deemed abandoned
Effective date of abandoning: 20231229