CN112380960A - Crowd counting method, device, equipment and storage medium - Google Patents


Info

Publication number
CN112380960A
CN112380960A
Authority
CN
China
Prior art keywords
head
shoulder detection
frame
image
frames
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011254152.6A
Other languages
Chinese (zh)
Inventor
林嘉鑫
赖蔚蔚
吴广财
郑杰生
郑颖龙
周昉昉
刘佳木
Current Assignee
Guangdong Electric Power Information Technology Co Ltd
Original Assignee
Guangdong Electric Power Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Guangdong Electric Power Information Technology Co Ltd filed Critical Guangdong Electric Power Information Technology Co Ltd
Priority to CN202011254152.6A
Publication of CN112380960A
Legal status: Pending


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V20/53 Recognition of crowd images, e.g. recognition of crowd congestion

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a crowd counting method, apparatus, device and storage medium, wherein the method comprises the following steps: sequentially inputting each frame of image in an acquired target video into a preset head and shoulder detection model for head and shoulder detection, and outputting a head and shoulder detection frame for each frame of image; matching the head and shoulder detection frames in two consecutive frames of images, and judging that two successfully matched head and shoulder detection frames are the same target; tracking the same target in the target video to obtain a tracking track; and counting the tracking tracks to obtain the people counting result for the target video. This solves the technical problem that existing pedestrian detection methods have large detection errors when the crowd is dense and pedestrians seriously occlude one another.

Description

Crowd counting method, device, equipment and storage medium
Technical Field
The present application relates to the field of image processing technologies, and in particular, to a method, an apparatus, a device, and a storage medium for people counting.
Background
Video surveillance systems have entered the era of intelligent video surveillance, following the eras of analog and digital video surveillance. In an intelligent video surveillance system, crowd density detection is a core task. In scenes such as campuses and stations in particular, crowd image data are collected by cameras and the number of people is rapidly analyzed and counted, so that alarms can be raised for high-density crowd scenes and safety accidents such as overcrowding and even stampedes can be avoided.
In the prior art, people are counted by pedestrian detection methods, which suffer from large detection errors when the crowd is dense and pedestrians seriously occlude one another.
Disclosure of Invention
The application provides a crowd counting method, apparatus, device and storage medium, which are used to solve the technical problem that existing pedestrian detection methods have large detection errors when the crowd is dense and pedestrians are seriously occluded.
In view of the above, the present application provides, in a first aspect, a crowd counting method, including:
sequentially inputting each frame of image in the acquired target video to a preset head and shoulder detection model for head and shoulder detection, and outputting a head and shoulder detection frame of each frame of image;
matching the head and shoulder detection frames in two consecutive frames of the images, and judging that two successfully matched head and shoulder detection frames are the same target;
tracking the same target in the target video to obtain a tracking track;
and calculating the number of the tracking tracks to obtain the people counting result in the target video.
Optionally, the preset head and shoulder detection model includes: a feature map reduction module and a multi-scale receptive field expansion module connected with the feature map reduction module;
correspondingly, the sequentially inputting each frame of image in the acquired target video to a preset head and shoulder detection model for head and shoulder detection and outputting a head and shoulder detection frame of each frame of image includes:
sequentially inputting each frame of image in the acquired target video to a preset head and shoulder detection model, enabling the feature map reduction module to perform feature extraction on the input image and reduce the size of the extracted feature map, performing multi-scale processing on the reduced feature map by the multi-scale receptive field expansion module, performing head and shoulder detection frame prediction based on the extracted multi-scale features, and outputting a head and shoulder detection frame of each frame of image.
Optionally, the feature map reduction module includes: a first convolutional layer, a second convolutional layer, a third convolutional layer and a fourth convolutional layer;
wherein the convolution kernel size of the first convolution layer is 7 × 7, and the convolution kernel size of the second convolution layer, the third convolution layer, and the fourth convolution layer is 3 × 3.
Optionally, the multi-scale receptive field expansion module includes: an Inception layer, a convolutional layer and 3 prediction layers.
Optionally, the matching each of the head and shoulder detection frames in two consecutive frames of the images, and determining that two successfully matched head and shoulder detection frames are the same target, includes:
calculating the intersection-over-union between the head and shoulder detection frames in two consecutive frames of the images, and when the maximum intersection-over-union is greater than a preset threshold value, judging that the two head and shoulder detection frames corresponding to the maximum intersection-over-union are successfully matched;
and judging that the two head and shoulder detection frames which are successfully matched are the same target.
A second aspect of the present application provides a people counting device comprising:
the output unit is used for sequentially inputting each frame of image in the acquired target video into a preset head and shoulder detection model for head and shoulder detection and outputting a head and shoulder detection frame of each frame of image;
the matching unit is used for matching the head and shoulder detection frames in two consecutive frames of images and judging that two successfully matched head and shoulder detection frames are the same target;
the tracking unit is used for tracking the same target in the target video to obtain a tracking track;
and the calculating unit is used for calculating the number of the tracking tracks to obtain the people counting result in the target video.
Optionally, the preset head and shoulder detection model includes: a feature map reduction module and a multi-scale receptive field expansion module connected with the feature map reduction module;
correspondingly, the output unit is specifically configured to:
sequentially inputting each frame of image in the acquired target video to a preset head and shoulder detection model, enabling the feature map reduction module to perform feature extraction on the input image and reduce the size of the extracted feature map, performing multi-scale processing on the reduced feature map by the multi-scale receptive field expansion module, performing head and shoulder detection frame prediction based on the extracted multi-scale features, and outputting a head and shoulder detection frame of each frame of image.
Optionally, the matching unit is specifically configured to:
calculating the intersection-over-union between the head and shoulder detection frames in two consecutive frames of the images, and when the maximum intersection-over-union is greater than a preset threshold value, judging that the two head and shoulder detection frames corresponding to the maximum intersection-over-union are successfully matched;
and judging that the two head and shoulder detection frames which are successfully matched are the same target.
A third aspect of the application provides a people counting device comprising a processor and a memory;
the memory is used for storing program codes and transmitting the program codes to the processor;
the processor is configured to perform the people counting method according to any of the first aspect according to instructions in the program code.
A fourth aspect of the present application provides a computer readable storage medium for storing program code for performing the people counting method of any one of the first aspect.
According to the technical scheme, the method has the following advantages:
the application provides a crowd counting method, which comprises the following steps: sequentially inputting each frame of image in the acquired target video to a preset head and shoulder detection model for head and shoulder detection, and outputting a head and shoulder detection frame of each frame of image; matching each head and shoulder detection frame in two continuous frames of images, and judging that the two successfully matched head and shoulder detection frames are the same target; tracking the same target in the target video to obtain a tracking track; and calculating the number of the tracking tracks to obtain the people counting result in the target video.
In the method and device of the application, the head and shoulders in each frame of image of the target video are detected by the preset head and shoulder detection model, which avoids the false and missed detections caused by people occluding one another; the head and shoulder detection frames in two consecutive frames of images are matched to determine which frames belong to the same target, the target is then tracked, and the people counting result for the target video is finally determined from the number of tracking tracks.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without inventive exercise.
Fig. 1 is a schematic flow chart of a crowd counting method according to an embodiment of the present application;
fig. 2 is a schematic structural diagram of a preset head and shoulder detection model according to an embodiment of the present disclosure;
fig. 3 is a schematic structural diagram of a crowd counting apparatus according to an embodiment of the present disclosure.
Detailed Description
The application provides a crowd counting method, apparatus, device and storage medium, which are used to solve the technical problem that existing pedestrian detection methods have large detection errors when the crowd is dense and pedestrians are seriously occluded.
In order to make the technical solutions of the present application better understood, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
For ease of understanding, referring to fig. 1, the present application provides an embodiment of a people counting method, comprising:
and 101, sequentially inputting each frame of image in the acquired target video into a preset head and shoulder detection model for head and shoulder detection, and outputting a head and shoulder detection frame of each frame of image.
A video of a crowd-dense area is captured by a camera to obtain the target video, and the target video can be split into frames to obtain each frame of image.
Existing pedestrian detection networks are large and costly in computing resources. To solve this problem, the preset head and shoulder detection model in the embodiment of the application is a lightweight neural network model consisting mainly of two parts: a feature map reduction module and a multi-scale receptive field expansion module. Each frame of image in the target video is sequentially input into the preset head and shoulder detection model, so that the feature map reduction module performs feature extraction on the input image and reduces the size of the extracted feature map, the multi-scale receptive field expansion module performs multi-scale processing on the reduced feature map and predicts head and shoulder detection frames based on the extracted multi-scale features, and a head and shoulder detection frame is output for each frame of image.
Further, the preset head and shoulder detection model can be structured as shown in fig. 2. The feature map reduction module rapidly reduces the spatial size of the feature map and increases the network operation speed. It comprises: a first convolutional layer, a second convolutional layer, a third convolutional layer and a fourth convolutional layer; the convolution kernel size of the first convolutional layer Conv1 is 7 × 7, and the convolution kernel sizes of the second convolutional layer Conv2, the third convolutional layer Conv3 and the fourth convolutional layer Conv4 are 3 × 3.
The feature map reduction module first uses a convolutional layer with a 7 × 7 kernel and a stride of 4 to rapidly reduce the size of the input image, which greatly shrinks the feature maps processed by the subsequent convolutional layers and thus reduces the amount of computation. At the same time, the 7 × 7 kernel has a relatively large number of parameters and a large receptive field, so the extracted features are richer, which reduces the loss of feature information caused by the rapid reduction of the feature map size. After the first convolution, a convolutional layer with a 3 × 3 kernel and a stride of 2 further reduces the feature size. A convolutional layer with a 3 × 3 kernel and a stride of 1 is then applied, which on the one hand slows the loss of feature information and on the other hand deepens the network so that it extracts more accurate deep features. Finally, the fourth convolutional layer Conv4 quickly reduces the feature map size to 1/16 of the input.
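As a numeric check of the downsampling arithmetic above, the following sketch traces the feature map size through the four convolutional layers. The 512 × 512 input, the padding values, and the stride of 2 for Conv4 are illustrative assumptions (the patent states the 7 × 7 stride-4, 3 × 3 stride-2 and 3 × 3 stride-1 layers explicitly, and that Conv4 brings the feature map to 1/16 of the input):

```python
def conv_out(size, kernel, stride, padding):
    # Standard convolution output size: floor((size + 2p - k) / s) + 1
    return (size + 2 * padding - kernel) // stride + 1

size = 512                        # hypothetical input resolution
size = conv_out(size, 7, 4, 3)    # Conv1: 7x7, stride 4 -> 128 (1/4)
size = conv_out(size, 3, 2, 1)    # Conv2: 3x3, stride 2 -> 64  (1/8)
size = conv_out(size, 3, 1, 1)    # Conv3: 3x3, stride 1 -> 64  (1/8)
size = conv_out(size, 3, 2, 1)    # Conv4: 3x3, stride 2 -> 32  (1/16)
print(size)
```

The combined stride 4 × 2 × 1 × 2 = 16 matches the stated 1/16 reduction.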
This rapid feature map reduction greatly increases speed while limiting the accuracy loss caused by the loss of feature information, so the model not only runs faster but also remains highly accurate. In addition, the network sets the numbers of convolution kernels of Conv1, Conv2, Conv3 and Conv4 to 12, 24 and 48 respectively, which reduces parameter redundancy and further improves operation efficiency.
Further, referring to fig. 2, the multi-scale receptive field expansion module includes: an Inception layer, a convolutional layer (a dilated convolutional layer) and 3 prediction layers. The multi-scale receptive field expansion module expands the receptive field associated with the target and, in combination with multi-scale receptive fields, provides rich contextual semantic information for the head-shoulder target features. The multi-scale receptive field expansion module in the embodiment of the application uses a dilation rate adapted to the distribution of head-shoulder data (a dilation rate of 3 is preferred), which greatly reduces the accuracy loss caused by multi-branch dilated convolutional layers; a dilated convolutional layer with a dilation rate of 3 enlarges the scale of the receptive field.
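The widening effect of the dilation rate can be checked with the standard effective-kernel identity (a general formula for dilated convolutions, not specific to this patent): a 3 × 3 kernel with dilation rate 3 spans the same area as a 7 × 7 kernel.

```python
def effective_kernel_size(kernel, dilation):
    # A dilated kernel spans k + (k - 1) * (d - 1) pixels per side.
    return kernel + (kernel - 1) * (dilation - 1)

print(effective_kernel_size(3, 1))  # 3: an ordinary 3x3 convolution
print(effective_kernel_size(3, 3))  # 7: dilation rate 3 widens the receptive field
```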
After the receptive field is enlarged and multiple scales are generated, the multi-scale receptive field expansion module predicts the head-shoulder targets separately on 3 different convolutional layers: the first prediction layer is set after the Inception layer, the second after Conv6_1 and the third after Conv9_1, with prior boxes of different scales. Because the aspect ratio of a head-shoulder target in head-shoulder detection approaches 1:1, in order to regress the targets efficiently and save computation on prior boxes, the embodiment of the application adopts prior boxes with an aspect ratio of 1:1. Combining layered prediction with this multi-scale prior box design effectively improves the robustness of the detector. The loss functions of the preset head and shoulder detection model in the embodiment of the application comprise a Softmax loss function and a Smooth L1 loss function: the Softmax loss function is mainly used to compute the loss of the predicted target category, and the Smooth L1 loss function is used to regress the predicted detection frames against the actual detection frames.
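The Smooth L1 regression loss mentioned above can be sketched as follows; this is the standard scalar definition (quadratic for small errors, linear for large ones), since the patent does not write out the formula:

```python
def smooth_l1(pred, target):
    # Sum of per-coordinate Smooth L1 terms:
    #   0.5 * d^2   if |d| < 1   (quadratic near zero, stable gradients)
    #   |d| - 0.5   otherwise    (linear, robust to outliers)
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d if d < 1.0 else d - 0.5
    return total

print(smooth_l1([0.5], [0.0]))  # 0.125 (quadratic region)
print(smooth_l1([2.0], [0.0]))  # 1.5 (linear region)
```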
For the head and shoulder detection frame results of the three prediction layers, non-maximum suppression is adopted to screen the head and shoulder detection frames, and the optimal head and shoulder detection frames are output.
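A minimal sketch of the non-maximum suppression step, assuming boxes in (x1, y1, x2, y2) form and an illustrative overlap threshold of 0.5 (the patent does not state the threshold value):

```python
def iou(a, b):
    # Intersection-over-union of two (x1, y1, x2, y2) boxes.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    if inter == 0:
        return 0.0
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union

def nms(boxes, scores, thresh=0.5):
    # Greedily keep the highest-scoring box, then drop boxes that overlap it too much.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < thresh]
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # [0, 2]: box 1 heavily overlaps box 0 and is suppressed
```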
Step 102: match the head and shoulder detection frames in two consecutive frames of images, and judge that two successfully matched head and shoulder detection frames are the same target.
After detection, each head-shoulder target in the current frame image corresponds to one head and shoulder detection frame; however, because of the limited accuracy of the detector, the detection results may contain missed and false detections. Therefore, in order to improve the accuracy of the crowd counting method, a multi-target tracking algorithm is added on top of detection to correct the detection results and obtain the tracking track of each head-shoulder target across consecutive video frames.
In the embodiment of the application, the IOU (the intersection-over-union between two detection frames) between the head and shoulder detection frames detected in the preceding and following frame images is used as the association basis, so that all head and shoulder detection frames in the two frames are matched directly, without considering the appearance of the detection targets or predicting motion trajectories.
Further, the matching process may be: calculate the intersection-over-union between the head and shoulder detection frames in two consecutive frames of images; when the maximum intersection-over-union is greater than a preset threshold value, the two head and shoulder detection frames corresponding to the maximum intersection-over-union are successfully matched and are judged to be the same target. Specifically, the IOU between each head and shoulder detection frame in the current frame image and each head and shoulder detection frame in the previous frame image is calculated; when a frame image is processed, for each tracked target, the detected head and shoulder detection frame with the maximum IOU relative to the target's previous position is selected. If this maximum IOU is greater than the preset threshold value, the two head and shoulder detection frames corresponding to it are judged to match and to be the same target; otherwise, the matching fails.
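The matching process above can be sketched as a greedy maximum-IOU association between the previous and current frame. The threshold of 0.3 is an illustrative assumption (the patent only speaks of a preset threshold value):

```python
def iou(a, b):
    # Intersection-over-union of two (x1, y1, x2, y2) boxes.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    if inter == 0:
        return 0.0
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union

def match_frames(prev_boxes, curr_boxes, thresh=0.3):
    # For each current detection, pick the unmatched previous box with the
    # maximum IOU; accept the pair only if that maximum exceeds the threshold.
    matches, used = [], set()
    for j, cb in enumerate(curr_boxes):
        best_i, best_v = -1, 0.0
        for i, pb in enumerate(prev_boxes):
            if i in used:
                continue
            v = iou(pb, cb)
            if v > best_v:
                best_i, best_v = i, v
        if best_v > thresh:
            matches.append((best_i, j))  # same target across the two frames
            used.add(best_i)
    return matches

prev = [(0, 0, 10, 10), (20, 20, 30, 30)]
curr = [(1, 1, 11, 11), (100, 100, 110, 110)]
print(match_frames(prev, curr))  # [(0, 0)]: the second current box is unmatched
```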
Step 103: track the same target in the target video to obtain a tracking track.
The same target in the target video is tracked, and each target yields a corresponding tracking track (tracklet). If a tracklet fails to match, the target is considered to have left. If there is a head and shoulder detection frame that does not match any tracklet, it is considered a newly appearing target and a new tracklet is created for it.
The embodiment of the application tracks the head-shoulder target detection frames: when the same target is detected in N consecutive frames of images (for example, 3 consecutive frames), tracking of the target starts; if the target is not detected in M consecutive frames of images after its last detection (for example, 10 consecutive frames), tracking ends.
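The tracklet lifecycle described above (start after N = 3 consecutive detections, end after M = 10 consecutive misses) can be sketched as follows; the class and method names are illustrative, not from the patent:

```python
class Tracklet:
    """One tracking track: confirmed after N hits, dropped after M misses."""

    def __init__(self, box, confirm_after=3, drop_after=10):
        self.box = box
        self.hits = 1
        self.misses = 0
        self.confirm_after = confirm_after
        self.drop_after = drop_after

    def update(self, box):
        # Matched in the current frame: refresh position, reset the miss count.
        self.box = box
        self.hits += 1
        self.misses = 0

    def mark_missed(self):
        self.misses += 1

    @property
    def confirmed(self):
        return self.hits >= self.confirm_after

    @property
    def lost(self):
        return self.misses >= self.drop_after

def count_people(tracklets):
    # The crowd count is the number of confirmed tracking tracks.
    return sum(1 for t in tracklets if t.confirmed)

t = Tracklet((0, 0, 10, 10))
t.update((1, 1, 11, 11))
t.update((2, 2, 12, 12))       # 3 consecutive detections: the track is confirmed
print(count_people([t]))       # 1
```

Counting the confirmed tracklets in this way yields the result described in step 104.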
Step 104: calculate the number of tracking tracks to obtain the people counting result for the target video.
Finally, the number of people in the target video is determined from the number of tracking tracks. Compared with a crowd counting strategy based on target detection in single frame images, this tracking-plus-detection strategy further improves the accuracy and robustness of crowd counting.
In the embodiment of the application, the head and shoulders in each frame of image of the target video are detected by the preset head and shoulder detection model, which avoids the false and missed detections caused by people occluding one another; the head and shoulder detection frames in two consecutive frames of images are matched to determine which frames belong to the same target, the target is then tracked, and the people counting result for the target video is finally determined from the number of tracking tracks.
The above is a crowd counting method provided by the present application, and the following is a crowd counting device provided by the embodiment of the present application.
Referring to fig. 3, an embodiment of a crowd counting apparatus provided in the present application includes:
the output unit 201 is configured to sequentially input each frame of image in the acquired target video to a preset head and shoulder detection model for head and shoulder detection, and output a head and shoulder detection frame of each frame of image;
a matching unit 202, configured to match the head and shoulder detection frames in two consecutive frames of images, and determine that two successfully matched head and shoulder detection frames are the same target;
the tracking unit 203 is configured to track the same target in the target video to obtain a tracking track;
and the calculating unit 204 is used for calculating the number of the tracking tracks to obtain the people counting result in the target video.
As a further improvement, the preset head and shoulder detection model comprises: a feature map reduction module and a multi-scale receptive field expansion module connected with the feature map reduction module;
correspondingly, the output unit 201 is specifically configured to:
and sequentially inputting each frame of image in the acquired target video into a preset head and shoulder detection model, so that a feature map reduction module performs feature extraction on the input image and reduces the size of the extracted feature map, a multi-scale receptive field expansion module performs multi-scale processing on the reduced feature map, performs head and shoulder detection frame prediction based on the extracted multi-scale features, and outputs a head and shoulder detection frame of each frame of image.
As a further improvement, the matching unit 202 is specifically configured to:
calculating the intersection-over-union between the head and shoulder detection frames in two consecutive frames of images, and when the maximum intersection-over-union is greater than a preset threshold value, judging that the two head and shoulder detection frames corresponding to the maximum intersection-over-union are successfully matched;
and judging that the two head and shoulder detection frames which are successfully matched are the same target.
The embodiment of the application also provides crowd counting equipment, which comprises a processor and a memory;
the memory is used for storing the program codes and transmitting the program codes to the processor;
the processor is adapted to perform the people counting method of the aforementioned embodiments of the people counting method according to instructions in the program code.
An embodiment of the present application further provides a computer-readable storage medium, which is used for storing program codes, and the program codes are used for executing the crowd counting method in the aforementioned crowd counting method embodiment.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application may be substantially implemented or contributed to by the prior art, or all or part of the technical solution may be embodied in a software product, which is stored in a storage medium and includes instructions for executing all or part of the steps of the method described in the embodiments of the present application through a computer device (which may be a personal computer, a server, or a network device). And the aforementioned storage medium includes: a U disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims (10)

1. A crowd counting method, comprising:
sequentially inputting each frame of image in the acquired target video to a preset head and shoulder detection model for head and shoulder detection, and outputting a head and shoulder detection frame of each frame of image;
matching the head and shoulder detection frames in two consecutive frames of the images, and judging that two successfully matched head and shoulder detection frames are the same target;
tracking the same target in the target video to obtain a tracking track;
and calculating the number of the tracking tracks to obtain the people counting result in the target video.
2. The crowd counting method of claim 1, wherein the preset head and shoulder detection model comprises: a feature map reduction module and a multi-scale receptive field expansion module connected with the feature map reduction module;
correspondingly, the sequentially inputting each frame of image in the acquired target video to a preset head and shoulder detection model for head and shoulder detection and outputting a head and shoulder detection frame of each frame of image comprises:
sequentially inputting each frame of image in the acquired target video to a preset head and shoulder detection model, enabling the feature map reduction module to perform feature extraction on the input image and reduce the size of the extracted feature map, performing multi-scale processing on the reduced feature map by the multi-scale receptive field expansion module, performing head and shoulder detection frame prediction based on the extracted multi-scale features, and outputting a head and shoulder detection frame of each frame of image.
3. The crowd counting method according to claim 2, wherein the feature map reduction module comprises: a first convolutional layer, a second convolutional layer, a third convolutional layer and a fourth convolutional layer;
wherein the convolution kernel size of the first convolutional layer is 7 × 7, and the convolution kernel sizes of the second, third and fourth convolutional layers are 3 × 3.
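Claim 3 fixes only the kernel sizes; strides and padding are not stated. Assuming stride 2 with "half" padding at every layer (a common choice for a downsampling stem, but an assumption here), the standard convolution output-size formula shows how the four layers progressively reduce the feature map:

```python
def conv_out(size, kernel, stride, padding):
    """Standard convolution output-size formula: floor((n + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1

def reduced_size(size):
    """Spatial size after the claimed 7x7 layer and three 3x3 layers,
    assuming stride 2 and half padding throughout (not stated in the claim)."""
    size = conv_out(size, kernel=7, stride=2, padding=3)  # first layer, 7x7
    for _ in range(3):                                    # three 3x3 layers
        size = conv_out(size, kernel=3, stride=2, padding=1)
    return size

print(reduced_size(640))  # 640 -> 320 -> 160 -> 80 -> 40
```

Under these assumed strides, each layer halves the spatial resolution, for a 16× overall reduction.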
4. The crowd counting method according to claim 2, wherein the multi-scale receptive field expansion module comprises: an Inception layer, a convolutional layer and 3 prediction layers.
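The Inception layer of claim 4 denotes parallel branches with different receptive-field sizes whose outputs are concatenated channel-wise. A numpy sketch of that idea using simple box filters; the branch sizes (1, 3, 5) are illustrative choices, not taken from the claim:

```python
import numpy as np

def box_filter(x, k):
    """k x k mean filter with zero padding (output keeps the input's size)."""
    p = k // 2
    xp = np.pad(x, p)
    out = np.zeros_like(x, dtype=float)
    h, w = x.shape
    for i in range(h):
        for j in range(w):
            out[i, j] = xp[i:i + k, j:j + k].mean()
    return out

def multi_scale(x, sizes=(1, 3, 5)):
    """Stack responses at several receptive-field sizes: shape (len(sizes), H, W)."""
    return np.stack([box_filter(x, k) for k in sizes])

feat = np.ones((8, 8))
out = multi_scale(feat)
print(out.shape)  # (3, 8, 8)
```

In a real Inception layer the branches are learned convolutions rather than fixed averages, but the structural point is the same: one input, several receptive-field sizes, one concatenated output.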
5. The crowd counting method according to claim 1, wherein the matching the head-shoulder detection boxes between two consecutive frames of the images and determining that two successfully matched head-shoulder detection boxes correspond to the same target comprises:
calculating the intersection-over-union between the head-shoulder detection boxes in two consecutive frames of the images, and when the maximum intersection-over-union is greater than a preset threshold, determining that the two head-shoulder detection boxes corresponding to the maximum intersection-over-union are successfully matched;
and determining that the two successfully matched head-shoulder detection boxes correspond to the same target.
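The matching rule of claim 5 can be sketched as greedy maximum-IoU assignment: repeatedly take the largest remaining IoU in the pairwise matrix, declare that pair matched, and remove both boxes. The 0.5 threshold below is illustrative, as the claim only requires "a preset threshold":

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def match_boxes(prev, curr, thr=0.5):
    """Greedy max-IoU matching between two frames' box lists.
    Returns (prev_index, curr_index) pairs judged to be the same target."""
    m = np.array([[iou(p, c) for c in curr] for p in prev])
    pairs = []
    while m.size and m.max() > thr:
        i, j = np.unravel_index(m.argmax(), m.shape)
        pairs.append((int(i), int(j)))
        m[i, :] = -1.0  # exclude both boxes from further matching
        m[:, j] = -1.0
    return pairs
```

Boxes left unpaired are treated as targets that entered or left the scene between the two frames.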
6. A crowd counting device, comprising:
an output unit, configured to sequentially input each frame of image in an acquired target video into a preset head-shoulder detection model for head-shoulder detection, and output a head-shoulder detection box for each frame of image;
a matching unit, configured to match the head-shoulder detection boxes between two consecutive frames of the images, and determine that two successfully matched head-shoulder detection boxes correspond to the same target;
a tracking unit, configured to track the same target through the target video to obtain a tracking track;
and a calculating unit, configured to count the number of tracking tracks to obtain a crowd counting result for the target video.
7. The crowd counting device according to claim 6, wherein the preset head-shoulder detection model comprises: a feature map reduction module, and a multi-scale receptive field expansion module connected to the feature map reduction module;
correspondingly, the output unit is specifically configured to:
sequentially input each frame of image in the acquired target video into the preset head-shoulder detection model, so that the feature map reduction module performs feature extraction on the input image and reduces the size of the extracted feature map, and the multi-scale receptive field expansion module performs multi-scale processing on the reduced feature map, performs head-shoulder detection box prediction based on the extracted multi-scale features, and outputs a head-shoulder detection box for each frame of image.
8. The crowd counting device according to claim 6, wherein the matching unit is specifically configured to:
calculate the intersection-over-union between the head-shoulder detection boxes in two consecutive frames of the images, and when the maximum intersection-over-union is greater than a preset threshold, determine that the two head-shoulder detection boxes corresponding to the maximum intersection-over-union are successfully matched;
and determine that the two successfully matched head-shoulder detection boxes correspond to the same target.
9. A crowd counting apparatus, characterized in that the apparatus comprises a processor and a memory;
the memory is configured to store program code and transmit the program code to the processor;
and the processor is configured to perform the crowd counting method according to any one of claims 1-5 according to instructions in the program code.
10. A computer-readable storage medium, characterized in that the storage medium is configured to store program code for performing the crowd counting method according to any one of claims 1-5.
CN202011254152.6A 2020-11-11 2020-11-11 Crowd counting method, device, equipment and storage medium Pending CN112380960A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011254152.6A CN112380960A (en) 2020-11-11 2020-11-11 Crowd counting method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN112380960A true CN112380960A (en) 2021-02-19

Family

ID=74582675

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011254152.6A Pending CN112380960A (en) 2020-11-11 2020-11-11 Crowd counting method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112380960A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113033353A (en) * 2021-03-11 2021-06-25 北京文安智能技术股份有限公司 Pedestrian trajectory generation method based on overlook image, storage medium and electronic device
CN113128430A (en) * 2021-04-25 2021-07-16 科大讯飞股份有限公司 Crowd gathering detection method and device, electronic equipment and storage medium
CN113988111A (en) * 2021-12-03 2022-01-28 深圳佑驾创新科技有限公司 Statistical method for pedestrian flow of public place and computer readable storage medium
CN114119648A (en) * 2021-11-12 2022-03-01 史缔纳农业科技(广东)有限公司 Pig counting method for fixed channel
CN114463378A (en) * 2021-12-27 2022-05-10 浙江大华技术股份有限公司 Target tracking method, electronic device and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109284670A (en) * 2018-08-01 2019-01-29 清华大学 A kind of pedestrian detection method and device based on multiple dimensioned attention mechanism
CN109697499A (en) * 2017-10-24 2019-04-30 北京京东尚科信息技术有限公司 Pedestrian's flow funnel generation method and device, storage medium, electronic equipment
CN110738160A (en) * 2019-10-12 2020-01-31 成都考拉悠然科技有限公司 human face quality evaluation method combining with human face detection
CN111611878A (en) * 2020-04-30 2020-09-01 杭州电子科技大学 Method for crowd counting and future people flow prediction based on video image

Similar Documents

Publication Publication Date Title
CN112380960A (en) Crowd counting method, device, equipment and storage medium
CN110942009B (en) Fall detection method and system based on space-time hybrid convolutional network
CN109272509B (en) Target detection method, device and equipment for continuous images and storage medium
Haines et al. Background subtraction with Dirichlet processes
CN111539290B (en) Video motion recognition method and device, electronic equipment and storage medium
CN110245579B (en) People flow density prediction method and device, computer equipment and readable medium
CN103929685A (en) Video abstract generating and indexing method
CN111104925B (en) Image processing method, image processing apparatus, storage medium, and electronic device
CN107563299B (en) Pedestrian detection method using RecNN to fuse context information
CN109446967B (en) Face detection method and system based on compressed information
CN110633643A (en) Abnormal behavior detection method and system for smart community
CN111652181B (en) Target tracking method and device and electronic equipment
CN112016461A (en) Multi-target behavior identification method and system
CN109697393B (en) Person tracking method, person tracking device, electronic device, and computer-readable medium
CN114926791A (en) Method and device for detecting abnormal lane change of vehicles at intersection, storage medium and electronic equipment
CN111950507B (en) Data processing and model training method, device, equipment and medium
CN110956097A (en) Method and module for extracting occluded human body and method and device for scene conversion
CN113642442B (en) Face detection method and device, computer readable storage medium and terminal
CN111383245A (en) Video detection method, video detection device and electronic equipment
CN115690732A (en) Multi-target pedestrian tracking method based on fine-grained feature extraction
CN112907623A (en) Statistical method and system for moving object in fixed video stream
CN112966136A (en) Face classification method and device
CN113554685A (en) Method and device for detecting moving target of remote sensing satellite, electronic equipment and storage medium
CN104732558B (en) moving object detection device
CN112598707A (en) Real-time video stream object detection and tracking method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination