WO2020114116A1 - Pedestrian detection method based on dense crowds, storage medium, and processor - Google Patents
Pedestrian detection method based on dense crowds, storage medium, and processor
- Publication number
- WO2020114116A1 (application PCT/CN2019/112433)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- pedestrian
- detection method
- method based
- pixels
- pedestrian detection
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/52—Surveillance or monitoring of activities, e.g. for recognising suspicious objects
- G06V20/53—Recognition of crowd images, e.g. recognition of crowd congestion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Definitions
- the invention relates to the field of target detection, in particular to a pedestrian detection method, storage medium and processor based on dense crowds.
- Target detection models mainly include R-CNN, Fast R-CNN, Faster R-CNN, YOLO, and SSD.
- the YOLO target detection model divides the image into a grid and uses anchor boxes, which greatly simplifies the generation of candidate boxes, and the overall performance of the model is superior.
- YOLOv3 uses the Darknet-53 backbone network with the following 9 anchors, whose length and width (in pixels) are: (10, 13), (16, 30), (33, 23), (30, 61), (62, 45), (59, 119), (116, 90), (156, 198), (373, 326).
- these 9 anchors were selected by clustering on the COCO dataset.
- in that dataset, the target object occupies a relatively large proportion of the image. In scenes where the target occupies only a small proportion, for example a public-area camera set up to monitor a wider area, each person takes up a very small part of the whole picture and pedestrians (targets) gather together. To detect smaller targets, the classification confidence threshold has to be lowered appropriately, but this brings another problem: multiple people (targets) are recognized as a single target, which makes target identification inaccurate.
- the technical problem to be solved by the present invention is to provide a pedestrian detection method based on dense crowd, which can realize the detection of small targets, and at the same time can realize the detection of dense targets (pedestrians).
- an embodiment of the present invention provides a pedestrian detection method based on dense crowd, including: training the yolo model using the COCO data set; framing each target with a rectangular box, treating the width and height of the rectangular box as one group, and clustering these groups into a preset number of classes to obtain a preset number of data-group anchors;
- a training model is formed according to the anchor sizes and ratios obtained from the width-height clustering of a group of pedestrians; the obtained training model is then used to predict the pedestrians in the image to be recognized, so as to identify the pedestrians in that image.
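The clustering step described above can be sketched as follows. The source does not name the clustering algorithm, so this sketch assumes k-means with a 1 - IoU distance, which is the approach commonly used to derive YOLO anchors. The function names, the synthetic sample boxes, and the parameters `iters` and `seed` are all hypothetical.

```python
import random


def wh_iou(a, b):
    """IoU of two boxes described only by (width, height), aligned at one corner."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    union = a[0] * a[1] + b[0] * b[1] - inter
    return inter / union


def cluster_anchors(boxes, k=6, iters=50, seed=0):
    """Cluster (width, height) pairs into k anchors; the distance is 1 - IoU (assumption)."""
    rng = random.Random(seed)
    centers = rng.sample(boxes, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for box in boxes:
            # Highest IoU is the same as smallest 1 - IoU distance.
            best = max(range(k), key=lambda i: wh_iou(box, centers[i]))
            groups[best].append(box)
        new_centers = []
        for i, group in enumerate(groups):
            if group:
                mean_w = sum(w for w, _ in group) / len(group)
                mean_h = sum(h for _, h in group) / len(group)
                new_centers.append((mean_w, mean_h))
            else:
                new_centers.append(centers[i])  # keep the center of an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    return sorted(centers, key=lambda c: c[0] * c[1])


def make_demo_boxes(n=2400, seed=1):
    """Synthetic stand-in for labeled pedestrian boxes (not real data)."""
    rng = random.Random(seed)
    boxes = []
    for _ in range(n):
        w = rng.uniform(3, 140)
        h = w * rng.choice([1.0, 1.5, 2.0]) * rng.uniform(0.9, 1.1)
        boxes.append((w, h))
    return boxes


anchors = cluster_anchors(make_demo_boxes(), k=6)
for w, h in anchors:
    print(f"({w:.0f}, {h:.0f})")
```

With real annotations, `make_demo_boxes` would be replaced by the widths and heights of the labeled pedestrian boxes.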
- the anchor sizes and ratios obtained from the width-height clustering of a group of pedestrians refer to the ratio of the length to the width of the rectangle used to frame the target object.
- the preset number is six.
- using the obtained training model to predict the pedestrians in the image to be recognized includes: inputting the image of the pedestrians to be recognized into the training model, which outputs a prediction of the number of pedestrians.
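As a rough illustration of the pedestrian-count output, the sketch below counts detections above a confidence threshold. The `Detection` structure, the `count_pedestrians` helper, the threshold value of 0.5, and the sample detections are all hypothetical; the source does not describe the output format of the model. The label `person` is the COCO class name for people.

```python
from typing import List, NamedTuple, Tuple


class Detection(NamedTuple):
    label: str                      # class name, e.g. "person"
    confidence: float               # classification confidence in [0, 1]
    box: Tuple[int, int, int, int]  # (x, y, width, height) in pixels


def count_pedestrians(detections: List[Detection],
                      conf_threshold: float = 0.5,
                      label: str = "person") -> int:
    """Count detections of the given class whose confidence reaches the threshold."""
    return sum(
        1 for d in detections
        if d.label == label and d.confidence >= conf_threshold
    )


# Made-up detections for illustration only.
sample = [
    Detection("person", 0.91, (12, 40, 23, 27)),
    Detection("person", 0.62, (60, 38, 10, 14)),
    Detection("person", 0.31, (90, 35, 3, 5)),
    Detection("backpack", 0.80, (14, 50, 10, 14)),
]
print(count_pedestrians(sample))
```

Lowering `conf_threshold` admits smaller, less certain targets, which is the trade-off the description discusses for small pedestrians.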
- the number of target samples is 2000 to 3000.
- the overlap with which the rectangular box frames the target object is greater than a threshold of 50%.
- the aspect ratio of the anchor point of the data set is 1:1.
- the aspect ratio of the anchor point of the data set is 1:1.5.
- the aspect ratio of the anchor point of the data set is 1:2.
- an embodiment of the present invention provides a storage medium, the storage medium includes a stored program, wherein the above-mentioned pedestrian detection method based on dense crowd is executed when the program runs.
- an embodiment of the present invention provides a processor for running a program, wherein the above-mentioned pedestrian detection method based on dense crowd is executed when the program is running.
- the above technical solution has the following advantages: it achieves dense target detection, in particular accurate detection when targets are occluded or overlapping; smaller targets can be detected; detection is faster; and it is generally applicable to large shopping malls, chain stores, airports, stations, museums, exhibition halls, and other public places.
- FIG. 1 is a flowchart of a pedestrian detection method based on a dense crowd of the present invention.
- FIG. 2 is a schematic diagram of the sizes of the nine anchors in the prior art.
- FIG. 3 is a schematic diagram of the six anchors used in the pedestrian detection method based on a dense crowd of the present invention.
- FIG. 4 is a schematic diagram of targets predicted by a yolo model trained on the COCO data set with the 9 anchors of the prior art.
- FIG. 5 is a schematic diagram of targets predicted by a yolo model trained on the COCO data set with the 6 anchor sizes obtained after optimization by the present invention.
- COCO: Common Objects in COntext, a data set provided by a Microsoft team that can be used for image recognition.
- the human visual system is fast and precise: a single glance is enough to identify the objects in an image and their positions, which is the idea behind the name YOLO (You Only Look Once).
- YOLO is a new target detection method, characterized by rapid detection and high accuracy.
- YOLO's network structure: the model uses a convolutional neural network. The initial convolutional layers extract image features, and the fully connected layers predict the output probabilities.
- FIG. 1 is a flowchart of a pedestrian detection method based on a dense crowd of the present invention. As shown in Figure 1, a pedestrian detection method based on dense crowd includes steps:
- the COCO data set is an open-source data set with 80 classes and standardized labeling, and each class has more than 2000 samples, so a model trained on it serves as a benchmark.
- anchors are organized in groups of 3. The original 3 groups are reduced to 2 groups, which reduces the amount of calculation without a significant decrease in performance (the reference for this is yolo-tiny).
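The grouping can be illustrated with a short sketch that splits an anchor list into groups of three. The anchor values are the ones listed in this document; the helper `group_anchors` and the constant names are hypothetical.

```python
# Anchor sizes in pixels, as listed in this document.
YOLOV3_ANCHORS = [(10, 13), (16, 30), (33, 23), (30, 61), (62, 45),
                  (59, 119), (116, 90), (156, 198), (373, 326)]
OPTIMIZED_ANCHORS = [(3, 5), (10, 14), (23, 27), (37, 58), (81, 82), (135, 169)]


def group_anchors(anchors, group_size=3):
    """Split a flat anchor list into consecutive groups of `group_size`."""
    if len(anchors) % group_size != 0:
        raise ValueError("anchor count must be a multiple of the group size")
    return [anchors[i:i + group_size] for i in range(0, len(anchors), group_size)]


original_groups = group_anchors(YOLOV3_ANCHORS)      # 3 groups of 3
optimized_groups = group_anchors(OPTIMIZED_ANCHORS)  # 2 groups of 3
print(len(original_groups), len(optimized_groups))
```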
- the number of target samples is 2000 to 3000. Empirically, detecting one object class or attribute generally requires training on about 2000 to 3000 samples.
- the overlap with which the rectangular box frames the target is greater than the threshold of 50%.
- the overlap is the ratio S1/S2, where S1 is the intersection of the predicted target box and the actual target object and S2 is their union; this ratio is compared with the threshold.
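The overlap ratio S1/S2 is the usual intersection over union (IoU). A minimal sketch follows, assuming boxes are given as corner coordinates `(x1, y1, x2, y2)`; that box format and the function names are assumptions, not taken from the source.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # S1: intersection area, zero when the boxes do not overlap.
    s1 = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    # S2: union area.
    s2 = area_a + area_b - s1
    return s1 / s2 if s2 > 0 else 0.0


def overlaps_enough(predicted, actual, threshold=0.5):
    """True when S1/S2 exceeds the 50% threshold stated in the text."""
    return iou(predicted, actual) > threshold
```

For example, two 10 x 10 boxes offset horizontally by 5 pixels share an area of 50 out of a union of 150, so their IoU is 1/3 and they fall below the 50% threshold.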
- the aspect ratio of the data-group anchor is set to 1:1. With this aspect ratio it is easier to frame a roughly square target (such as the box around a pedestrian who is sitting or squatting).
- the aspect ratio of the data-group anchor may also be set to 1:1.5. This suits a medium-length elongated target, such as the box around a pedestrian with arms spread or carrying a backpack.
- the aspect ratio of the data-group anchor can also be set to 1:2. This aspect ratio can frame upright pedestrian targets.
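To make the three aspect ratios concrete, the sketch below assigns a box to the nearest of 1:1, 1:1.5, and 1:2. It assumes the ratio is read as width:height, which the text suggests by pairing 1:2 with upright pedestrians; the helper `nearest_ratio` is hypothetical.

```python
# Height divided by width for each named aspect ratio (width:height assumed).
TARGET_RATIOS = {"1:1": 1.0, "1:1.5": 1.5, "1:2": 2.0}


def nearest_ratio(width, height):
    """Return the name of the target aspect ratio closest to height / width."""
    ratio = height / width
    return min(TARGET_RATIOS, key=lambda name: abs(TARGET_RATIOS[name] - ratio))


# Anchor sizes from this document, plus one made-up upright box (40, 80).
for w, h in [(81, 82), (37, 58), (40, 80)]:
    print((w, h), nearest_ratio(w, h))
```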
- FIG. 2 is a schematic diagram of the sizes of the nine anchors in the prior art. As shown in FIG. 2, the unit is pixels, and the length and width of the 9 anchors are (10 pixels, 13 pixels), (16 pixels, 30 pixels), (33 pixels, 23 pixels), (30 pixels, 61 pixels), (62 pixels, 45 pixels), (59 pixels, 119 pixels), (116 pixels, 90 pixels), (156 pixels, 198 pixels), (373 pixels, 326 pixels).
- FIG. 3 is a schematic diagram of the 6 anchors used in the pedestrian detection method based on a dense crowd of the present invention. As shown in FIG. 3, the COCO data set is used to train the yolo model.
- the length and width of the six optimized anchors are (3 pixels, 5 pixels), (10 pixels, 14 pixels), (23 pixels, 27 pixels), (37 pixels, 58 pixels), (81 pixels, 82 pixels), (135 pixels, 169 pixels).
- FIG. 4 is a schematic diagram of targets predicted by a yolo model trained on the COCO data set with the 9 anchors of the prior art.
- the length and width of the 9 anchors are (10 pixels, 13 pixels), (16 pixels, 30 pixels), (33 pixels, 23 pixels), (30 pixels, 61 pixels), (62 pixels, 45 pixels), (59 pixels, 119 pixels), (116 pixels, 90 pixels), (156 pixels, 198 pixels), (373 pixels, 326 pixels).
- the anchor point (373 pixels, 326 pixels
- the anchor can frame the target, but it easily produces a single box with pedestrians inside it, and the output features can easily exceed the threshold.
- regions that are not pedestrians are then easily misjudged as pedestrians, which makes pedestrian detection inefficient and wastes time.
- FIG. 5 is a schematic diagram of targets predicted by a yolo model trained on the COCO data set with the 6 anchors obtained after optimization by the present invention.
- the length and width of the 6 anchors are (3 pixels, 5 pixels), (10 pixels, 14 pixels), (23 pixels, 27 pixels), (37 pixels, 58 pixels), (81 pixels, 82 pixels), (135 pixels, 169 pixels).
- when the detection target is a pedestrian, each target is framed with a rectangular box and 2000 to 3000 samples are used; the width and height of each rectangular box are treated as one group and clustered into 6 groups, which yields a data set similar to the 6 groups above and replaces the original 9 groups of data.
- because the number of rectangular boxes is reduced from 9 to 6, the amount of calculation is also significantly reduced.
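A back-of-the-envelope estimate of the saving is sketched below. The source gives no input size or network layout, so this assumes a 416 x 416 input, three detection scales with strides 32, 16, and 8 for the nine-anchor YOLOv3 layout, and two scales with strides 32 and 16 for a six-anchor, yolo-tiny style layout. The function name is hypothetical.

```python
def num_candidate_boxes(input_size, strides, anchors_per_group=3):
    """Candidate boxes per image: one anchor group per scale, applied at every grid cell."""
    return sum((input_size // s) ** 2 * anchors_per_group for s in strides)


# Assumed layouts, not taken from the source.
nine_anchor_boxes = num_candidate_boxes(416, strides=(32, 16, 8))  # grids 13, 26, 52
six_anchor_boxes = num_candidate_boxes(416, strides=(32, 16))      # grids 13, 26

print(nine_anchor_boxes, six_anchor_boxes)
```

Under these assumptions the count drops from 10647 to 2535 boxes, because the finest grid contributes most of the candidates.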
- the dense-crowd-based pedestrian detection method according to the present invention achieves dense target detection, in particular accurate detection when targets are occluded or overlapping; smaller targets can be detected; detection is faster; and the method is generally applicable to large shopping malls, chain stores, airports, stations, museums, exhibition halls, and other public places.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- Computing Systems (AREA)
- Software Systems (AREA)
- Molecular Biology (AREA)
- Computational Linguistics (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Mathematical Physics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
Claims (9)
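- A pedestrian detection method based on dense crowds, characterized by comprising: training a yolo model using the COCO data set; framing each target with a rectangular box, treating the width and height of the rectangular box as one group, and clustering these groups into a preset number of classes to obtain a preset number of data-group anchors; forming a training model according to the anchor sizes and ratios obtained from the width-height clustering of a group of pedestrians; and using the obtained training model to predict the pedestrians in an image to be recognized, so as to identify the pedestrians in the image to be recognized.
- The pedestrian detection method based on dense crowds according to claim 1, characterized in that the anchor sizes and ratios obtained from the width-height clustering of a group of pedestrians refer to the ratio of the length to the width of the rectangle used to frame the target.
- The pedestrian detection method based on dense crowds according to claim 1, characterized in that the preset number is six.
- The pedestrian detection method based on dense crowds according to claim 1, characterized in that using the obtained training model to predict the pedestrians in the image to be recognized comprises: inputting the image of the pedestrians to be recognized into the training model, the training model outputting a prediction of the number of pedestrians.
- The pedestrian detection method based on dense crowds according to claim 1, characterized in that the number of target samples is 2000 to 3000.
- The pedestrian detection method based on dense crowds according to claim 1, characterized in that the overlap with which the rectangular box frames the target is greater than a threshold of 50%.
- The pedestrian detection method based on dense crowds according to claim 1, characterized in that the aspect ratio of the data-group anchor is 1:1, 1:1.5, or 1:2.
- A storage medium, characterized in that the storage medium comprises a stored program, wherein the program, when run, executes the pedestrian detection method based on dense crowds according to any one of claims 1 to 7.
- A processor, characterized in that the processor is configured to run a program, wherein the program, when run, executes the pedestrian detection method based on dense crowds according to any one of claims 1 to 7.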
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811485647.2 | 2018-12-06 | ||
CN201811485647.2A CN111291587A (zh) | 2018-12-06 | 2018-12-06 | Pedestrian detection method based on dense crowds, storage medium, and processor |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2020114116A1 (zh) | 2020-06-11 |
Family
ID=70974982
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2019/112433 WO2020114116A1 (zh) | Pedestrian detection method based on dense crowds, storage medium, and processor | 2018-12-06 | 2019-10-22 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN111291587A (zh) |
WO (1) | WO2020114116A1 (zh) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
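CN111931729A (zh) * | 2020-09-23 | 2020-11-13 | 平安国际智慧城市科技股份有限公司 | Artificial-intelligence-based pedestrian detection method, apparatus, device, and medium |
CN112686340A (zh) * | 2021-03-12 | 2021-04-20 | 成都点泽智能科技有限公司 | Dense small-target detection method based on a deep neural network |
CN112966618A (zh) * | 2021-03-11 | 2021-06-15 | 京东数科海益信息科技有限公司 | Clothing recognition method, apparatus, device, and computer-readable medium |
CN113568407A (zh) * | 2021-07-27 | 2021-10-29 | 山东中科先进技术研究院有限公司 | Human-machine collaboration safety early-warning method and *** based on depth vision |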
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
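CN112634202A (zh) * | 2020-12-04 | 2021-04-09 | 浙江省农业科学院 | Method, apparatus, and *** for behavior detection of mixed-culture fish schools based on YOLOv3-Lite |
CN113158897A (zh) * | 2021-04-21 | 2021-07-23 | 新疆大学 | Pedestrian detection *** based on an embedded YOLOv3 algorithm |
CN115994887B (zh) * | 2022-09-06 | 2024-01-09 | 江苏济远医疗科技有限公司 | Dense-target analysis method for medical images based on dynamic anchors |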
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
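CN106960195A (zh) * | 2017-03-27 | 2017-07-18 | 深圳市丰巨泰科电子有限公司 | Crowd counting method and apparatus based on deep learning |
CN107169421A (zh) * | 2017-04-20 | 2017-09-15 | 华南理工大学 | Target detection method for vehicle driving scenes based on a deep convolutional neural network |
CN107273836A (zh) * | 2017-06-07 | 2017-10-20 | 深圳市深网视界科技有限公司 | Pedestrian detection and recognition method, apparatus, model, and medium |
CN107316001A (zh) * | 2017-05-31 | 2017-11-03 | 天津大学 | Method for detecting small and dense traffic signs in autonomous-driving scenes |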
US10410120B1 (en) * | 2019-01-25 | 2019-09-10 | StradVision, Inc. | Learning method and testing method of object detector to be used for surveillance based on R-CNN capable of converting modes according to aspect ratios or scales of objects, and learning device and testing device using the same |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107624189B (zh) * | 2015-05-18 | 2020-11-20 | 北京市商汤科技开发有限公司 | Method and device for generating a prediction model |
-
2018
- 2018-12-06 CN CN201811485647.2A patent/CN111291587A/zh active Pending
-
2019
- 2019-10-22 WO PCT/CN2019/112433 patent/WO2020114116A1/zh active Application Filing
Non-Patent Citations (1)
Title |
---|
CHUCHU ZHANG ET AL: "Dense Crowd Scene Pedestrian Detection Based on Improved YOLOv2 Network", MODERN COMPUTER, no. 28, 5 October 2018 (2018-10-05), pages 34 - 39, XP009521586, DOI: 10.3969/j.issn.1007-1423.2018.28.009 * |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
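CN111931729A (zh) * | 2020-09-23 | 2020-11-13 | 平安国际智慧城市科技股份有限公司 | Artificial-intelligence-based pedestrian detection method, apparatus, device, and medium |
CN111931729B (zh) * | 2020-09-23 | 2021-01-08 | 平安国际智慧城市科技股份有限公司 | Artificial-intelligence-based pedestrian detection method, apparatus, device, and medium |
CN112966618A (zh) * | 2021-03-11 | 2021-06-15 | 京东数科海益信息科技有限公司 | Clothing recognition method, apparatus, device, and computer-readable medium |
CN112966618B (zh) * | 2021-03-11 | 2024-02-09 | 京东科技信息技术有限公司 | Clothing recognition method, apparatus, device, and computer-readable medium |
CN112686340A (zh) * | 2021-03-12 | 2021-04-20 | 成都点泽智能科技有限公司 | Dense small-target detection method based on a deep neural network |
CN113568407A (zh) * | 2021-07-27 | 2021-10-29 | 山东中科先进技术研究院有限公司 | Human-machine collaboration safety early-warning method and *** based on depth vision |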
Also Published As
Publication number | Publication date |
---|---|
CN111291587A (zh) | 2020-06-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
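WO2020114116A1 (zh) | Pedestrian detection method based on dense crowds, storage medium, and processor | |
CN106096577B (zh) | Target tracking method in a camera distribution map | |
CN110609920B (zh) | Hybrid pedestrian search method and *** for video surveillance scenes | |
CN104303193B (zh) | Clustering-based target classification | |
CN109766868B (zh) | Real-scene occluded-pedestrian detection network based on body keypoint detection, and detection method thereof | |
CN103824070B (zh) | Fast pedestrian detection method based on computer vision | |
EP3343443A1 (en) | Object detection for video camera self-calibration | |
US20130148848A1 (en) | Method and apparatus for video analytics based object counting | |
TW202013252A (zh) | License plate recognition system and method | |
WO2021139049A1 (zh) | Detection method, detection apparatus, monitoring device, and computer-readable storage medium | |
CN110232379A (zh) | Vehicle pose detection method and *** | |
CN102542289A (zh) | Pedestrian flow counting method based on a multi-Gaussian counting model | |
CN103347167A (zh) | Segmentation-based method for describing surveillance video content | |
CN110969604B (zh) | Deep-learning-based intelligent security real-time window-opening detection and alarm *** and method | |
WO2023155482A1 (zh) | Method, ***, device, and medium for recognizing rapid crowd-gathering behavior | |
CN104463232A (zh) | Dense crowd counting method based on HOG features and color histogram features | |
CN112464893A (zh) | Congestion level classification method for complex environments | |
CN107103299B (zh) | Method for counting people in surveillance video | |
CN108471497A (zh) | Real-time ship target detection method based on a pan-tilt camera | |
CN111027370A (zh) | Multi-target tracking and behavior analysis detection method | |
CN109582824A (zh) | Regional security management *** and method based on video structuring | |
CN103646254A (zh) | High-density pedestrian detection method | |
CN107862713A (zh) | Real-time camera deflection detection and early-warning method and module for polled conference sites | |
CN111144209A (zh) | Head detection method for surveillance video based on a heterogeneous multi-branch deep convolutional neural network | |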
Pazzaglia et al. | People counting on low cost embedded hardware during the SARS-CoV-2 pandemic |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 19892484 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 19892484 Country of ref document: EP Kind code of ref document: A1 |
|
32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 24.09.2021) |