WO2021031954A1 - Method and device for determining the number of objects, storage medium, and electronic device - Google Patents

Method and device for determining the number of objects, storage medium, and electronic device

Info

Publication number
WO2021031954A1
WO2021031954A1 (PCT/CN2020/108677)
Authority
WO
WIPO (PCT)
Prior art keywords
image
objects
processed
value
preset threshold
Prior art date
Application number
PCT/CN2020/108677
Other languages
English (en)
French (fr)
Inventor
郁昌存
王德鑫
Original Assignee
北京海益同展信息科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京海益同展信息科技有限公司 filed Critical 北京海益同展信息科技有限公司
Publication of WO2021031954A1 publication Critical patent/WO2021031954A1/zh

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/52Surveillance or monitoring of activities, e.g. for recognising suspicious objects

Definitions

  • the present disclosure relates to the field of computer vision technology, and in particular to a method for determining the number of objects, a device for determining the number of objects, a computer-readable storage medium, and electronic equipment.
  • the traditional method is to count the number of objects flowing in and out at the entrances and exits of the target area, for example by setting gates or infrared sensing equipment at the entrances and exits of a scenic spot, or barrier gate equipment at the entrances and exits of a parking lot.
  • this method cannot count objects in open areas, for example the number of tourists in open scenic spots or the number of vehicles on a street; moreover, it can only count the total number of objects in the target area and cannot determine their distribution.
  • the present disclosure provides a method for determining the number of objects, a device for determining the number of objects, a computer-readable storage medium, and an electronic device, thereby improving, at least to some extent, on the prior art, in which the accuracy of determining the number of objects is low when the density of objects is high.
  • a method for determining the number of objects, including: recognizing objects in an image to be processed and using the number of recognized objects as a first value; comparing the first value with a preset threshold; if the first value is less than the preset threshold, determining the number of objects in the image to be processed as the first value; and if the first value is greater than the preset threshold, performing density detection on the objects in the image to be processed to obtain a second value related to the number of objects, and determining the number of objects in the image to be processed as the second value.
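The branching logic described above can be sketched in a few lines of Python. The `detect` and `estimate_density` callables are stand-ins for the recognition and density-detection models discussed later; they are illustrative assumptions, not part of the disclosure itself.

```python
def count_objects(image, detect, estimate_density, threshold):
    """Return the object count for one image, following the described flow.

    `detect` returns a list of bounding boxes; `estimate_density` returns
    a 2-D density map whose values sum to an estimated count. Both are
    hypothetical stand-ins for the trained models.
    """
    first_value = len(detect(image))        # object recognition (first value)
    if first_value < threshold:             # sparse scene: detection is trusted
        return first_value
    # dense scene: fall back to density detection (second value)
    density_map = estimate_density(image)
    second_value = round(sum(sum(row) for row in density_map))
    return second_value
```

With a threshold of 10, a scene where the detector finds 3 objects returns 3 directly, while a scene with 12 detections defers to the density map instead.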
  • the method further includes: acquiring a target image, dividing the target image into a plurality of regions, and using an image of each region as the image to be processed.
  • each of the regions has a corresponding preset threshold.
  • the recognizing the object in the image to be processed includes: recognizing the object in the image to be processed through a pre-trained first neural network model.
  • the first neural network model includes the YOLO model (You Only Look Once, an algorithm framework for real-time target detection with versions v1, v2, v3, etc.; the present disclosure can use any version).
  • the performing density detection on the object in the image to be processed includes: performing density detection on the object in the image to be processed through a pre-trained second neural network model.
  • the second neural network model includes: a first branch network for performing a first convolution process on the image to be processed to obtain a first feature image; a second branch network for performing a second convolution process on the image to be processed to obtain a second feature image; a third branch network for performing a third convolution process on the image to be processed to obtain a third feature image; a merging layer for merging the first, second, and third feature images into a final feature image; and an output layer for mapping the final feature image to a density image.
  • a device for determining the number of objects, including: a recognition module for recognizing objects in an image to be processed and using the number of recognized objects as a first value; a comparison module for comparing the first value with a preset threshold; a first determination module configured to determine the number of objects in the image to be processed as the first value if the first value is less than the preset threshold; and a second determination module configured to, if the first value is greater than the preset threshold, perform density detection on the objects in the image to be processed to obtain a second value related to the number of objects, and determine the number of objects in the image to be processed as the second value.
  • the device further includes: an acquisition module configured to acquire a target image, divide the target image into a plurality of regions, and use the image of each region as the image to be processed.
  • each of the regions has a corresponding preset threshold.
  • the recognition module is configured to recognize the object in the image to be processed through a pre-trained first neural network model.
  • the first neural network model includes a YOLO model.
  • the second determination module includes: a density detection unit, configured to perform density detection on the object in the image to be processed through a pre-trained second neural network model.
  • the second neural network model includes: a first branch network for performing a first convolution process on the image to be processed to obtain a first feature image; a second branch network for performing a second convolution process on the image to be processed to obtain a second feature image; a third branch network for performing a third convolution process on the image to be processed to obtain a third feature image; a merging layer for merging the first, second, and third feature images into a final feature image; and an output layer for mapping the final feature image to a density image.
  • a computer-readable storage medium having a computer program stored thereon, and when the computer program is executed by a processor, the method described in any one of the above is implemented.
  • an electronic device including: a processor; and a memory for storing executable instructions of the processor; wherein the processor is configured to perform any of the methods described above by executing the executable instructions.
  • This exemplary embodiment recognizes the objects in the image to be processed and, based on the relationship between the first value obtained from recognition and the preset threshold, determines whether the objects in the image are sparse or dense, and accordingly whether to use the first value or the second value obtained from density detection as the final result. When the first value is greater than the preset threshold, the objects in the image are dense and may occlude one another; in this case density detection is used and the second value is taken as the final result, so the number of objects is determined more accurately. Moreover, the combination of object recognition and density detection is highly flexible: by adjusting the preset threshold, this exemplary embodiment can be applied to various scenarios and has high applicability.
  • Fig. 1 shows a flowchart of a method for determining the number of objects in this exemplary embodiment
  • Figure 2 shows the scenic spot monitoring image to be processed
  • Figure 3 shows a visualized effect diagram of tourist recognition on surveillance images of scenic spots
  • Fig. 4 shows a structure diagram of a neural network model in this exemplary embodiment
  • Fig. 5 shows a schematic diagram of dividing regions of a target image in this exemplary embodiment
  • Fig. 6 shows a flowchart of another method for determining the number of objects in this exemplary embodiment
  • Fig. 7 shows a structural block diagram of a device for determining the number of objects in this exemplary embodiment
  • FIG. 8 shows a computer-readable storage medium for implementing the above method in this exemplary embodiment
  • Fig. 9 shows an electronic device for implementing the above-mentioned method in this exemplary embodiment.
  • Example embodiments will now be described more fully with reference to the accompanying drawings.
  • the example embodiments can be implemented in various forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that the present disclosure is thorough and complete and fully conveys the concept of the example embodiments to those skilled in the art.
  • the described features, structures or characteristics may be combined in one or more embodiments in any suitable way.
  • Exemplary embodiments of the present disclosure first provide a method for determining the number of objects in an image.
  • the application scenarios of the method include but are not limited to: counting people in areas such as scenic spots and shopping malls; counting vehicles in areas such as parking lots and streets; monitoring the number of ships in areas such as ports and docks; and monitoring the number of livestock on a farm.
  • the following takes the scenario of counting the number of people in a scenic spot as an example, and the method content is also applicable to other scenarios.
  • Fig. 1 shows the method flow of this exemplary embodiment, which may include steps S110 to S140:
  • Step S110 Recognize the objects in the image to be processed, and use the number of recognized objects as the first value.
  • the image to be processed may be a surveillance image of a scenic spot, a GIS image (Geographic Information System; GIS images include satellite views of the ground surface, population heat maps, etc.), or the like.
  • For example, the video stream of the surveillance cameras in the scenic area can be pulled through a background computer or server. Web cameras provide protocols such as RTMP (Real-Time Messaging Protocol) and HTTP (Hypertext Transfer Protocol), and the stream can be read, for example, with OpenCV (Open Source Computer Vision Library).
  • a deep learning technology may be used to recognize objects in the image to be processed through a pre-trained first neural network model.
  • The first neural network model can use the YOLO model. The YOLO model can be trained on an open-source dense-pedestrian-detection dataset, or on a dataset obtained by manually labeling pictures from the application scene (for example, labeling tourists in a large number of scenic-spot monitoring images).
  • the YOLO model takes the scenic spot monitoring image as input, and the bounding box information of all tourists in the image as output.
  • input Figure 2 into the YOLO model, and the output visualization effect can be referred to as shown in Figure 3.
  • the YOLO model identifies the tourists in the image, and what is actually obtained is the bounding box (x, y, w, h) of each tourist, where x and y are the position coordinates of the center of the bounding box in the image, and w and h are the width and height of the bounding box.
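The (x, y, w, h) center format above is what YOLO-style detectors emit; converting it to corner coordinates is a common downstream step (the conversion itself is a general convention, not something the patent specifies):

```python
def center_box_to_corners(x, y, w, h):
    """Convert a (center-x, center-y, width, height) bounding box,
    as produced by YOLO-style detectors, to (x_min, y_min, x_max, y_max)."""
    return (x - w / 2, y - h / 2, x + w / 2, y + h / 2)
```

For example, a 20x10 box centered at (50, 40) has corners (40, 35) and (60, 45).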
  • the first neural network model can also use other target-detection algorithm models, such as R-CNN (Region-based Convolutional Neural Network, or its improved versions Fast R-CNN, Faster R-CNN, etc.) or SSD (Single Shot MultiBox Detector).
  • Alternatively, contours may be detected in the image to be processed, and a region whose contour is close to the shape of the target object may be recognized as an object.
  • the number of objects recognized from the image to be processed is the first value.
  • Step S120 comparing the first value with a preset threshold.
  • When the number of objects is small, the first value obtained in step S110 is close to the true number of objects, i.e., the reliability of the first value is high. When the number of objects is large, some objects may be occluded, or the image resolution of a single object may be too low for it to be identified, so the reliability of the first value is low.
  • For example, when the first neural network model is used to identify the tourists in the surveillance image, many detections are missed in the central area where tourists are dense.
  • Whether the first value is credible can be judged by comparing it with the preset threshold: if the first value is less than the preset threshold, the objects in the image to be processed are relatively sparse and the first value is credible; otherwise, the objects are relatively dense and the first value is unreliable.
  • the preset threshold may be determined according to experience, the characteristics of the region corresponding to the image to be processed, the size relationship between the image to be processed and the object, etc., which is not particularly limited in the present disclosure.
  • step S130 if the first value is less than the preset threshold, the number of objects in the image to be processed is determined as the first value.
  • When the condition of step S130 is satisfied, the first value is credible, so it can be used as the number of objects in the image to be processed and output as the result.
  • Step S140 if the first value is greater than the preset threshold, density detection is performed on objects in the image to be processed to obtain a second value regarding the number of objects, and the number of objects in the image to be processed is determined as the second value.
  • When the condition of step S140 is met, the first value is not credible, so a method other than object recognition, namely density detection, is used to determine the number of objects in the image to be processed.
  • Density detection differs from object recognition: it mainly regresses the probability of an object being present in each region (or at each pixel) of the image to be processed, and obtains the number of objects statistically; this is the second value mentioned above. When there are many objects, especially when they are densely distributed and occluded, density detection is more credible than object recognition, so the second value can be used as the number of objects in the image to be processed and output as the result.
  • the case where the first value is equal to the preset threshold can be regarded as a special case meeting the condition of step S130, or as a special case meeting the condition of step S140, so that step S130 or S140 is executed. There is no particular limitation on this.
  • the density detection of the object in the image to be processed may be performed through a pre-trained second neural network model.
  • the second neural network model may adopt the MCNN model (Multi-column Convolutional Neural Network, multi-column convolutional neural network).
  • Fig. 4 shows a structure of the MCNN model 400, which may include: an input layer 410 for inputting the image to be processed; a first branch network 420 for performing a first convolution process on the image to be processed to obtain a first feature image; a second branch network 430 for performing a second convolution process on the image to be processed to obtain a second feature image; a third branch network 440 for performing a third convolution process on the image to be processed to obtain a third feature image; a merging layer 450 for merging the first, second, and third feature images into a final feature image; and an output layer 460 for mapping the final feature image to a density image.
  • The first, second, and third convolution processes each include a series of operations such as convolution and pooling, but with different parameters, for example different convolution kernel sizes and pooling parameters.
  • In the density image, the value of each point represents the probability that the point belongs to an object; accumulating the values of all points yields the second value, which represents the number of objects in the image to be processed.
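The accumulation step is a straightforward sum over the density map. As a sketch (the sub-window variant is an inference from the document's claim that density maps also reveal the distribution of objects, not an explicit step in the patent):

```python
def density_count(density_map):
    """Accumulate all point values of the density image; the sum is the
    second value, i.e. the estimated number of objects."""
    return sum(v for row in density_map for v in row)

def window_count(density_map, r0, r1, c0, c1):
    """Summing over any sub-window of the same map yields the estimated
    count inside that window, recovering the spatial distribution."""
    return sum(v for row in density_map[r0:r1] for v in row[c0:c1])
```

For a 2x2 map `[[0.5, 0.5], [1.0, 2.0]]`, the total is 4.0 and the top row alone contributes 1.0.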
  • the training of the MCNN model can be based on open source data sets.
  • the image annotation can be the coordinates of each human head.
  • the geometric adaptive Gaussian kernel is used to convert the human head coordinates into a probability density image.
  • the sum of the probabilities of each human head region is 1.
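Converting annotated head coordinates into a training density image can be sketched as follows. This uses a fixed-sigma Gaussian for simplicity; the geometry-adaptive kernel mentioned above instead varies sigma per head (typically with the distance to neighboring heads), so the fixed sigma here is an assumption of the sketch. Each head's blob is normalized to sum to 1, so the whole image sums to the head count.

```python
import math

def heads_to_density(head_coords, height, width, sigma=2.0):
    """Rasterise annotated (x, y) head coordinates into a ground-truth
    density image; each head contributes a Gaussian blob whose values
    sum to exactly 1 (fixed-sigma sketch of the adaptive-kernel idea)."""
    density = [[0.0] * width for _ in range(height)]
    for (hx, hy) in head_coords:
        # unnormalised Gaussian centred on the head
        blob = [[math.exp(-((x - hx) ** 2 + (y - hy) ** 2) / (2 * sigma ** 2))
                 for x in range(width)] for y in range(height)]
        total = sum(v for row in blob for v in row)
        for y in range(height):
            for x in range(width):
                density[y][x] += blob[y][x] / total
    return density
```

Because each blob is normalized, a 16x16 image annotated with two heads integrates to 2.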
  • the second neural network model can also use other density-detection networks, such as a variant of MCNN in which a fourth branch network is added, an intermediate layer is added to the first, second, or third branch network, or one or more fully connected layers are added; the present disclosure does not specifically limit this.
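The multi-column idea — parallel branches with different receptive fields merged into one density map — can be illustrated with a toy pure-Python version. The uniform averaging kernels and equal merge weights below are stand-ins for the learned convolutions and the learned 1x1 merge; a real MCNN trains these parameters.

```python
def conv2d(image, kernel):
    """'Same' 2-D convolution with zero padding (single channel)."""
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            s = 0.0
            for i in range(kh):
                for j in range(kw):
                    yy, xx = y + i - ph, x + j - pw
                    if 0 <= yy < h and 0 <= xx < w:
                        s += image[yy][xx] * kernel[i][j]
            out[y][x] = s
    return out

def uniform_kernel(k):
    """k x k averaging kernel (sums to 1) -- a stand-in for learned weights."""
    return [[1.0 / (k * k)] * k for _ in range(k)]

def mcnn_sketch(image):
    """Toy multi-column network: three branches with different receptive
    fields (9x9, 5x5, 3x3), merged by a 1x1 weighted sum into a density map."""
    branches = [conv2d(image, uniform_kernel(k)) for k in (9, 5, 3)]
    weights = (1 / 3, 1 / 3, 1 / 3)  # stand-in for the learned 1x1 merge
    h, w = len(image), len(image[0])
    return [[sum(wgt * b[y][x] for wgt, b in zip(weights, branches))
             for x in range(w)] for y in range(h)]
```

Because every kernel sums to 1, a single unit impulse far from the border produces a density map that also sums to 1, mirroring the count-preserving behavior a trained density network aims for.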
  • In summary, this exemplary embodiment recognizes the objects in the image to be processed and, according to the relationship between the first value obtained from recognition and the preset threshold, determines whether the objects in the image are sparse or dense, and hence whether the first value or the second value obtained by density detection is used as the final result. When the first value is greater than the preset threshold, the objects in the image are dense and may occlude one another; in this case density detection is used and the second value is taken as the final result, so the number of objects is determined more accurately and the exemplary embodiment achieves higher accuracy. Moreover, the combination of object recognition and density detection is highly flexible: by adjusting the preset threshold, this exemplary embodiment can be applied to various scenarios and has high applicability.
  • the target image after acquiring the target image, the target image may be divided into multiple regions, and the images of each region may be used as the image to be processed.
  • the target image is a complete image that needs to determine the number of objects.
  • Since the captured image contains parts of the fixed scene, the sky, and other interference factors that disturb the tourist count, and the distribution of tourists differs between dense and sparse regions, the regions can be dealt with separately.
  • Figure 2 can be divided into multiple regions based on prior experience, and the method flow of Fig. 1 is performed on the image of each region; finally, the numbers of objects in all regions are added to obtain the total number of objects in the target image.
  • When processing the image of each region, the preset threshold used can be the same or different; that is, all regions can share a unified preset threshold, or each region can have its own corresponding preset threshold. For example, in Fig. 5, a smaller preset threshold can be set for regions two and three and a larger preset threshold for region four.
  • the preset threshold for each area can be determined based on experience or calculated based on image features.
  • For example, calculate the area of the part of each region's image where tourists may appear, divide it by the image area occupied by a single tourist, and estimate the number of tourists each region can hold; the number of tourists when the region is full with no occlusion can be used as the preset threshold, or it can be multiplied by an empirical coefficient less than 1 (such as 0.9), etc.; the present disclosure does not specially limit this. Using a targeted preset threshold for each region yields the total number of objects in the target image more accurately.
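The per-region threshold estimate above is simple arithmetic; here is a sketch with illustrative numbers (the concrete areas and the 0.9 coefficient are examples, not values from the patent):

```python
def estimate_region_threshold(walkable_area, area_per_person, coefficient=0.9):
    """Estimate a per-region preset threshold: (area where tourists may
    appear) / (area occupied per tourist), scaled by an empirical
    coefficient less than 1. All inputs are in the same (arbitrary) units."""
    capacity = walkable_area / area_per_person  # tourists when full, no occlusion
    return int(capacity * coefficient)
```

For instance, a region with 200 units of walkable area and 0.5 units per person holds 400 people when full, giving a threshold of 360 with the 0.9 coefficient.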
  • Fig. 6 shows another flow of this exemplary embodiment, including: step S601, acquiring a target image, for example a surveillance image; step S602, dividing the target image into multiple regions; step S603, taking the image of each region as an image to be processed and performing steps S604 to S608 on it: step S604, detecting the number of objects in the image to be processed through object recognition, as the first value; step S605, comparing the first value with the preset threshold; step S606, if the first value is less than the preset threshold, determining the number of objects in the region as the first value; step S607, if the first value is greater than the preset threshold, the first value is not credible, and density detection is performed on the image to be processed; step S608, determining the number of objects in the region as the second value. After the number of objects in each region is obtained, step S609 is performed: the numbers of objects in all regions are accumulated to obtain the total number of objects in the target image, which is finally determined as the number of objects in the target image.
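Steps S602-S609 can be sketched end to end as follows. As before, `detect` and `estimate_density` are hypothetical stand-ins for the trained models, and the per-region thresholds are supplied by the caller:

```python
def total_objects(region_images, thresholds, detect, estimate_density):
    """Count each region with its own preset threshold (detection when
    sparse, density detection when dense), then accumulate the per-region
    counts into the total for the target image (steps S604-S609)."""
    total = 0
    for image, threshold in zip(region_images, thresholds):
        first = len(detect(image))                 # S604
        if first < threshold:                      # S605/S606
            total += first
        else:                                      # S607/S608
            total += round(sum(v for row in estimate_density(image)
                               for v in row))
    return total                                   # S609
```

With one sparse region (2 detections, threshold 5) and one dense region (8 detections, density map summing to 11), the total is 2 + 11 = 13.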
  • The device 700 may include: a recognition module 710 for recognizing objects in the image to be processed and using the number of recognized objects as a first value; a comparison module 720 for comparing the first value with a preset threshold; a first determination module 730 for determining the number of objects in the image to be processed as the first value if the first value is less than the preset threshold; and a second determination module 740 configured to, if the first value is greater than the preset threshold, perform density detection on the objects in the image to be processed to obtain a second value related to the number of objects, and determine the number of objects in the image to be processed as the second value.
  • the device 700 for determining the number of objects may further include: an acquisition module (not shown in the figure) configured to acquire a target image, divide the target image into a plurality of regions, and use the image of each region as the image to be processed.
  • each of the foregoing regions has a corresponding preset threshold.
  • the recognition module 710 may be used to recognize an object in the image to be processed through a pre-trained first neural network model.
  • the first neural network model may be a YOLO model.
  • the second determination module 740 may include: a density detection unit (not shown in the figure), configured to perform density detection on objects in the image to be processed through a pre-trained second neural network model.
  • the second neural network model may include: a first branch network for performing a first convolution process on the image to be processed to obtain a first feature image; a second branch network for performing a second convolution process on the image to be processed to obtain a second feature image; a third branch network for performing a third convolution process on the image to be processed to obtain a third feature image; a merging layer for merging the first, second, and third feature images into a final feature image; and an output layer for mapping the final feature image to a density image.
  • Exemplary embodiments of the present disclosure also provide a computer-readable storage medium on which is stored a program product capable of implementing the above method of this specification.
  • various aspects of the present disclosure can also be implemented in the form of a program product, which includes program code.
  • When the program product runs on a terminal device, the program code causes the terminal device to execute the steps according to the various exemplary embodiments of the present disclosure described in the "Exemplary Method" section of this specification.
  • A program product 800 for implementing the above method according to an exemplary embodiment of the present disclosure may adopt a portable compact disc read-only memory (CD-ROM), include program code, and run on a terminal device such as a personal computer.
  • the program product of the present disclosure is not limited thereto.
  • the readable storage medium can be any tangible medium that contains or stores a program, and the program can be used by or in combination with an instruction execution system, device, or device.
  • the program product can adopt any combination of one or more readable media.
  • the readable medium may be a readable signal medium or a readable storage medium.
  • the readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • the computer-readable signal medium may include a data signal propagated in baseband or as a part of a carrier wave, and readable program code is carried therein. This propagated data signal can take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • the readable signal medium may also be any readable medium other than a readable storage medium, and the readable medium may send, propagate, or transmit a program for use by or in combination with the instruction execution system, apparatus, or device.
  • the program code contained on the readable medium can be transmitted by any suitable medium, including but not limited to wireless, wired, optical cable, RF, etc., or any suitable combination of the foregoing.
  • The program code for performing the operations of the present disclosure can be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar languages.
  • The program code can be executed entirely on the user's computing device, partly on the user's device, as an independent software package, partly on the user's computing device and partly on a remote computing device, or entirely on a remote computing device or server.
  • The remote computing device can be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or can be connected to an external computing device (for example, via the Internet using an Internet service provider).
  • Exemplary embodiments of the present disclosure also provide an electronic device capable of implementing the above method.
  • the electronic device 900 according to this exemplary embodiment of the present disclosure will be described below with reference to FIG. 9.
  • the electronic device 900 shown in FIG. 9 is only an example, and should not bring any limitation to the function and scope of use of the embodiments of the present disclosure.
  • the electronic device 900 may be in the form of a general-purpose computing device.
  • the components of the electronic device 900 may include, but are not limited to: the aforementioned at least one processing unit 910, the aforementioned at least one storage unit 920, a bus 930 connecting different system components (including the storage unit 920 and the processing unit 910), and a display unit 940.
  • the storage unit 920 stores program codes, and the program codes can be executed by the processing unit 910 so that the processing unit 910 executes the steps according to various exemplary embodiments of the present disclosure described in the "Exemplary Method" section of this specification.
  • the processing unit 910 may execute the method steps shown in FIG. 4 or FIG. 5 and the like.
  • the storage unit 920 may include a readable medium in the form of a volatile storage unit, such as a random access storage unit (RAM) 921 and/or a cache storage unit 922, and may further include a read-only storage unit (ROM) 923.
  • the storage unit 920 may also include a program/utility tool 924 having a set of (at least one) program module 925.
  • the program module 925 includes but is not limited to: an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination thereof, may include an implementation of a network environment.
  • the bus 930 may represent one or more of several types of bus structures, including a storage unit bus or storage unit controller, a peripheral bus, a graphics acceleration port, a processing unit, or a local bus using any of a variety of bus structures.
  • the electronic device 900 may also communicate with one or more external devices 1000 (such as keyboards, pointing devices, Bluetooth devices, etc.), with one or more devices that enable users to interact with the electronic device 900, and/or with any device (e.g., a router or modem) that enables the electronic device 900 to communicate with one or more other computing devices. This communication can be performed through an input/output (I/O) interface 950.
  • the electronic device 900 may also communicate with one or more networks (for example, a local area network (LAN), a wide area network (WAN), and/or a public network, such as the Internet) through the network adapter 960.
  • the network adapter 960 communicates with the other modules of the electronic device 900 through the bus 930. It should be understood that, although not shown in the figure, other hardware and/or software modules can be used in conjunction with the electronic device 900, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage systems, etc.
  • the exemplary embodiments described herein can be implemented by software, or by software combined with necessary hardware. Therefore, the technical solution according to the embodiments of the present disclosure can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (such as a CD-ROM, USB flash drive, or portable hard disk) or on a network, and includes several instructions to make a computing device (which may be a personal computer, a server, a terminal device, or a network device) execute the method according to the exemplary embodiments of the present disclosure.
  • although modules or units of the device for action execution are mentioned in the above detailed description, this division is not mandatory.
  • the features and functions of two or more modules or units described above may be embodied in one module or unit.
  • the features and functions of a module or unit described above can be further divided into multiple modules or units to be embodied.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The present disclosure provides a method for determining the number of objects, an apparatus for determining the number of objects, a computer-readable storage medium, and an electronic device, and belongs to the field of computer vision technology. The method includes: recognizing objects in an image to be processed, and taking the number of recognized objects as a first value; comparing the first value with a preset threshold; if the first value is less than the preset threshold, determining the number of objects in the image to be processed as the first value; if the first value is greater than the preset threshold, performing density detection on the objects in the image to be processed to obtain a second value for the number of objects, and determining the number of objects in the image to be processed as the second value. The present disclosure can determine the number of objects fairly accurately even when the objects are densely distributed, and has high applicability.

Description

Method and apparatus for determining the number of objects, storage medium, and electronic device
This application claims priority to Chinese patent application No. 201910769944.8, filed on August 20, 2019 and entitled "Method and apparatus for determining the number of objects, storage medium, and electronic device", the entire content of which is incorporated herein by reference.
Technical Field
The present disclosure relates to the field of computer vision technology, and in particular to a method for determining the number of objects, an apparatus for determining the number of objects, a computer-readable storage medium, and an electronic device.
Background
In many situations it is necessary to count objects of a certain kind, for example counting the number of visitors in a scenic area or the number of vehicles in a parking lot.
The traditional approach is to count the objects flowing in and out at the entrances and exits of the target area, for example by installing turnstiles or infrared sensing devices at the entrances of a scenic area, or barrier gates at the entrances of a parking lot. However, this approach cannot count objects in open areas, such as visitors in an open scenic area or vehicles on a street, and it can only count the total number of objects in the target area without determining how the objects are distributed.
With the development of deep learning and computer vision, methods have emerged in the related art that determine the number of objects from surveillance images. Taking visitor counting in a scenic area as an example, surveillance cameras are installed at different locations in the area to capture images in real time, and visitors are recognized from the images so that their number can be counted. Compared with the traditional approach above, this is a clear improvement: it can be applied to open areas and can count the distribution of objects within the area. However, problems remain: when the object density is high, and especially when occlusion occurs (for example, in a scenic area during a holiday peak, or on a street during rush hour), the accuracy of the related art is low, and the determined number of objects deviates considerably from the actual number, usually falling below it, which limits its application.
It should be noted that the information disclosed in the Background section above is only intended to enhance understanding of the background of the present disclosure, and may therefore include information that does not constitute prior art known to a person of ordinary skill in the art.
Summary
The present disclosure provides a method for determining the number of objects, an apparatus for determining the number of objects, a computer-readable storage medium, and an electronic device, thereby improving, at least to some extent, the problem in the related art that the accuracy of determining the number of objects is low when the object density is high.
Other characteristics and advantages of the present disclosure will become apparent from the following detailed description, or will be learned in part through practice of the present disclosure.
According to a first aspect of the present disclosure, a method for determining the number of objects is provided, including: recognizing objects in an image to be processed, and taking the number of recognized objects as a first value; comparing the first value with a preset threshold; if the first value is less than the preset threshold, determining the number of objects in the image to be processed as the first value; and if the first value is greater than the preset threshold, performing density detection on the objects in the image to be processed to obtain a second value for the number of objects, and determining the number of objects in the image to be processed as the second value.
In an exemplary embodiment of the present disclosure, the method further includes: acquiring a target image, dividing the target image into a plurality of regions, and taking the image of each region as an image to be processed.
In an exemplary embodiment of the present disclosure, each region has a corresponding preset threshold.
In an exemplary embodiment of the present disclosure, recognizing the objects in the image to be processed includes: recognizing the objects in the image to be processed through a pre-trained first neural network model.
In an exemplary embodiment of the present disclosure, the first neural network model includes a YOLO model (You Only Look Once, a real-time object detection framework with multiple versions such as v1, v2, and v3; any of these versions may be adopted in the present disclosure).
In an exemplary embodiment of the present disclosure, performing density detection on the objects in the image to be processed includes: performing density detection on the objects in the image to be processed through a pre-trained second neural network model.
In an exemplary embodiment of the present disclosure, the second neural network model includes: a first branch network for performing first convolution processing on the image to be processed to obtain a first feature image; a second branch network for performing second convolution processing on the image to be processed to obtain a second feature image; a third branch network for performing third convolution processing on the image to be processed to obtain a third feature image; a merging layer for merging the first feature image, the second feature image, and the third feature image into a final feature image; and an output layer for mapping the final feature image to a density image.
According to a second aspect of the present disclosure, an apparatus for determining the number of objects is provided, including: a recognition module for recognizing objects in an image to be processed and taking the number of recognized objects as a first value; a comparison module for comparing the first value with a preset threshold; a first determination module for determining the number of objects in the image to be processed as the first value if the first value is less than the preset threshold; and a second determination module for performing density detection on the objects in the image to be processed if the first value is greater than the preset threshold, obtaining a second value for the number of objects, and determining the number of objects in the image to be processed as the second value.
In an exemplary embodiment of the present disclosure, the apparatus further includes: an acquisition module for acquiring a target image, dividing the target image into a plurality of regions, and taking the image of each region as an image to be processed.
In an exemplary embodiment of the present disclosure, each region has a corresponding preset threshold.
In an exemplary embodiment of the present disclosure, the recognition module is configured to recognize the objects in the image to be processed through a pre-trained first neural network model.
In an exemplary embodiment of the present disclosure, the first neural network model includes a YOLO model.
In an exemplary embodiment of the present disclosure, the second determination module includes: a density detection unit for performing density detection on the objects in the image to be processed through a pre-trained second neural network model.
In an exemplary embodiment of the present disclosure, the second neural network model includes: a first branch network for performing first convolution processing on the image to be processed to obtain a first feature image; a second branch network for performing second convolution processing on the image to be processed to obtain a second feature image; a third branch network for performing third convolution processing on the image to be processed to obtain a third feature image; a merging layer for merging the first feature image, the second feature image, and the third feature image into a final feature image; and an output layer for mapping the final feature image to a density image.
According to a third aspect of the present disclosure, a computer-readable storage medium is provided, on which a computer program is stored, wherein the computer program, when executed by a processor, implements any one of the methods described above.
According to a fourth aspect of the present disclosure, an electronic device is provided, including: a processor; and a memory for storing executable instructions of the processor; wherein the processor is configured to execute any one of the methods described above by executing the executable instructions.
Exemplary embodiments of the present disclosure have the following beneficial effects:
Objects in the image to be processed are recognized, and based on how the first value obtained by recognition compares with the preset threshold, it is judged whether the objects in the image are sparse or dense, thereby determining whether to adopt the first value as the final result or the second value obtained by density detection as the final result. On the one hand, if the first value is greater than the preset threshold, the objects in the image are dense and occlusion may be present; in this case density detection is used and the resulting second value is taken as the final result, so that the number of objects can be determined fairly accurately, giving this exemplary embodiment high accuracy. On the other hand, combining the two approaches of object recognition and density detection offers high flexibility: by adjusting the preset threshold, this exemplary embodiment can be applied to a variety of different scenarios, giving it high applicability.
It should be understood that the above general description and the following detailed description are merely exemplary and explanatory, and do not limit the present disclosure.
Brief Description of the Drawings
The accompanying drawings here are incorporated into and constitute a part of the specification, illustrate embodiments consistent with the present disclosure, and together with the specification serve to explain the principles of the present disclosure. Obviously, the drawings in the following description are only some embodiments of the present disclosure; for a person of ordinary skill in the art, other drawings can be obtained from them without creative effort.
FIG. 1 shows a flowchart of a method for determining the number of objects in this exemplary embodiment;
FIG. 2 shows a surveillance image of a scenic area to be processed;
FIG. 3 shows a visualization of visitor recognition performed on the surveillance image of the scenic area;
FIG. 4 shows a structural diagram of a neural network model in this exemplary embodiment;
FIG. 5 shows a schematic diagram of dividing a target image into regions in this exemplary embodiment;
FIG. 6 shows a flowchart of another method for determining the number of objects in this exemplary embodiment;
FIG. 7 shows a structural block diagram of an apparatus for determining the number of objects in this exemplary embodiment;
FIG. 8 shows a computer-readable storage medium for implementing the above method in this exemplary embodiment;
FIG. 9 shows an electronic device for implementing the above method in this exemplary embodiment.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. However, the example embodiments can be implemented in a variety of forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that the present disclosure will be more thorough and complete, and will fully convey the concept of the example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
Exemplary embodiments of the present disclosure first provide a method for determining the number of objects in an image. Application scenarios of the method include, but are not limited to: counting people in areas such as scenic spots and shopping malls; counting vehicles in areas such as parking lots and streets; monitoring the number of ships in areas such as ports and docks; and monitoring the number of livestock on a farm. The scenario of counting visitors in a scenic area is described below as an example; the method applies equally to other scenarios.
FIG. 1 shows the method flow of this exemplary embodiment, which may include steps S110 to S140:
Step S110: recognize the objects in the image to be processed, and take the number of recognized objects as a first value.
The image to be processed may be a surveillance image of the scenic area, a GIS image (Geographic Information System; GIS images include satellite views of the ground surface, population heat maps, and the like), etc. For example, a backend computer or server may pull the video stream of a surveillance camera in the scenic area. Network cameras currently provide video streams over protocols such as rtmp (Real Time Messaging Protocol) and http (Hyper Text Transfer Protocol); the online video stream can be pulled through OpenCV (Open Source Computer Vision Library) to obtain real-time video frames, and a single frame is taken as the image to be processed. FIG. 2 shows a single surveillance frame of a scenic area.
After the image to be processed is acquired, the objects in it can be recognized. In an exemplary embodiment, deep learning may be used: the objects in the image to be processed are recognized through a pre-trained first neural network model. For example, the first neural network model may be a YOLO model, which can be trained on an open-source dense pedestrian detection dataset; alternatively, a dataset can be obtained by manually annotating images of the application scenario (for example, labeling visitors in a large number of scenic-area surveillance images) and used for training. The YOLO model takes a scenic-area surveillance image as input and outputs the bounding box information of all visitors in the image. For example, if FIG. 2 is fed into the YOLO model, the visualized output may be as shown in FIG. 3: the YOLO model recognizes the visitors in the image and ultimately yields, for each visitor, a bounding box (x, y, w, h), where x and y are the position coordinates of the center of the bounding box in the image, and w and h are the width and height of the bounding box. In addition, the first neural network model may also adopt other object detection algorithm models such as R-CNN (Region-Convolutional Neural Network, or improved versions such as Fast R-CNN and Faster R-CNN) or SSD (Single Shot MultiBox Detector). In an exemplary embodiment, object contours may also be detected from the image to be processed, and an item whose contour shape is close to the object's shape is recognized as an object.
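As a minimal sketch of step S110, the snippet below counts detector outputs to obtain the first value. The box format (x, y, w, h) follows the description above; the extra confidence score and the 0.5 cutoff are illustrative assumptions, since the disclosure does not fix a specific detector interface.

```python
# Obtain the "first value" (step S110) by counting detector outputs.
# Each detection is assumed to be (x, y, w, h, confidence); the 0.5 cutoff
# is an illustrative assumption, not part of the disclosure.
def count_detections(boxes, conf_threshold=0.5):
    """Return the number of detected objects with sufficient confidence."""
    return sum(1 for (x, y, w, h, conf) in boxes if conf >= conf_threshold)

detections = [
    (120, 80, 24, 60, 0.92),   # confident visitor detection
    (300, 95, 22, 58, 0.81),   # confident visitor detection
    (415, 90, 20, 55, 0.31),   # low-confidence detection, ignored
]
first_value = count_detections(detections)  # -> 2
```

The first value is then compared against the preset threshold in step S120 to decide whether it can be trusted.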
In this exemplary embodiment, the number of objects recognized from the image to be processed is the first value.
Step S120: compare the first value with a preset threshold.
Generally, when there are few objects in the image to be processed, each object is relatively complete in the image and easy to recognize, so the first value obtained in step S110 is close to the true number of objects; that is, the first value has high credibility. When there are many objects, several of them may be occluded, or the image resolution of an individual object may be low, making the objects hard to recognize; the first value then has low credibility. As shown in FIG. 2 and FIG. 3 above, when there are many visitors in the scenic area and the first neural network model is used to recognize visitors in the surveillance image, many visitors are missed in the densely crowded central region.
In this exemplary embodiment, whether the first value is credible is determined by comparing it with the preset threshold: if the first value is less than the preset threshold, the objects in the image to be processed are relatively sparse and the first value is credible; otherwise, the objects are relatively dense and the first value is not credible. The preset threshold may be determined based on experience, the characteristics of the region corresponding to the image to be processed, the size relationship between the image and the objects, and so on; the present disclosure does not specifically limit this.
Step S130: if the first value is less than the preset threshold, determine the number of objects in the image to be processed as the first value.
As can be seen from the above, when the condition of step S130 is met, the first value is credible, so it can be taken as the number of objects in the image to be processed and output as the result.
Step S140: if the first value is greater than the preset threshold, perform density detection on the objects in the image to be processed to obtain a second value for the number of objects, and determine the number of objects in the image to be processed as the second value.
When the condition of step S140 is met, the first value is not credible, and an approach other than object recognition, namely density detection, can be used to determine the number of objects in the image to be processed. Unlike object recognition, density detection mainly regresses the probability that an object is present in each region (or each pixel) of the image to be processed, and obtains the number of objects in the image statistically; this is the second value described above. When there are many objects, and especially when they are densely distributed and occluded, density detection is more credible than object recognition, so the second value can be taken as the number of objects in the image to be processed and output as the result.
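The statistical nature of density detection can be illustrated with a toy density map: each entry holds the probability that the pixel belongs to an object, and summing all entries gives the second value. The map values below are fabricated for illustration only.

```python
import numpy as np

# Toy density map: each entry is the probability that the pixel belongs to an
# object; the sum over all entries is the "second value" (the object count).
density_map = np.zeros((4, 4))
density_map[0, 0:2] = 0.5               # one object spread over two pixels
density_map[2, 1:4] = [0.2, 0.6, 0.2]   # a second object over three pixels

second_value = float(density_map.sum())  # -> 2.0
count = int(round(second_value))         # -> 2
```

Because each object contributes a total mass of about 1 to the map, partially occluded objects still add their mass, which is why this route degrades more gracefully in crowds than box-based recognition.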
It should be added that the case where the first value equals the preset threshold can be treated either as a special case satisfying the condition of step S130 or as a special case satisfying the condition of step S140, so that step S130 or S140 is executed accordingly; the present disclosure does not specifically limit this.
In an exemplary embodiment, density detection may be performed on the objects in the image to be processed through a pre-trained second neural network model. For example, the second neural network model may be an MCNN model (Multi-column Convolutional Neural Network). FIG. 4 shows one structure of an MCNN model 400, which may include: an input layer 410 for inputting the image to be processed; a first branch network 420 for performing first convolution processing on the image to obtain a first feature image; a second branch network 430 for performing second convolution processing on the image to obtain a second feature image; a third branch network 440 for performing third convolution processing on the image to obtain a third feature image; a merging layer 450 for merging the first, second, and third feature images into a final feature image; and an output layer 460 for mapping the final feature image to a density image. The first, second, and third convolution processing each include a series of operations such as convolution and pooling, but with different parameters (such as convolution kernel size and pooling parameters), which amounts to extracting features of the image to be processed at different scales to obtain the first, second, and third feature images respectively. These are then merged into a final feature image, which is mapped to a density image, for example through 1*1 convolution. In the density image, the value of each point represents the probability that the point belongs to an object; summing the values of all points yields the second value representing the number of objects in the image to be processed.
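The merge-and-map step at the end of the MCNN structure can be sketched with plain arrays: three feature maps (random stand-ins here for the branch outputs) are stacked along a channel axis, and a 1*1 convolution (a per-pixel weighted sum across channels) maps them to a single-channel density image. The shapes and kernel weights are illustrative assumptions, not the trained model's values.

```python
import numpy as np

rng = np.random.default_rng(0)
h, w = 8, 8  # spatial size of the (downsampled) feature maps

# Stand-ins for the outputs of the three branch networks; in the real model
# they come from different kernel/pooling sizes but share a spatial shape.
feat1, feat2, feat3 = (rng.random((h, w)) for _ in range(3))

# Merging layer: stack along a channel axis -> shape (3, h, w).
merged = np.stack([feat1, feat2, feat3], axis=0)

# Output layer: a 1*1 convolution is a per-pixel weighted sum over channels.
kernel = np.array([0.2, 0.5, 0.3])               # illustrative 1x1 weights
density = np.tensordot(kernel, merged, axes=1)   # shape (h, w)

second_value = density.sum()  # summing the density image yields the count
```

This shows only the shape bookkeeping of layers 450 and 460; the branch networks themselves are full convolutional stacks that this sketch deliberately omits.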
Training of the MCNN model can be based on open-source datasets. The image annotation may be the coordinates of each head; a geometry-adaptive Gaussian kernel is used to convert the head coordinates into a probability density image, so that the probabilities within each head region sum to 1. Taking the initial images as samples and the converted probability density images as labels (ground truth), the model can be trained.
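The ground-truth construction described above can be sketched as follows: each annotated head becomes a unit impulse smoothed by a normalized Gaussian kernel, so each head contributes a total mass of 1 and the map sums to the head count. A fixed sigma is used here for simplicity; the geometry-adaptive version instead chooses sigma per head from the distance to its nearest neighbours.

```python
import numpy as np

def gaussian_kernel(sigma, radius):
    """Normalized 2-D Gaussian kernel: its entries sum to 1."""
    ax = np.arange(-radius, radius + 1)
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    return k / k.sum()

def density_ground_truth(head_coords, shape, sigma=2.0):
    """Convert head coordinates into a density map whose sum equals the count.

    Fixed sigma is an illustrative simplification of MCNN's geometry-adaptive
    kernel; heads are assumed to lie at least `radius` pixels from the border.
    """
    radius = int(3 * sigma)
    kernel = gaussian_kernel(sigma, radius)
    out = np.zeros(shape)
    for (r, c) in head_coords:
        out[r - radius:r + radius + 1, c - radius:c + radius + 1] += kernel
    return out

heads = [(20, 20), (30, 40), (32, 42)]      # made-up annotations
gt = density_ground_truth(heads, shape=(64, 64))
total_mass = gt.sum()  # ~3.0: one unit of mass per annotated head
```

A model trained against such maps can then be evaluated by comparing the sum of its predicted density image with the annotated count.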
It should be understood that the second neural network model may also adopt other density detection networks, such as variants of MCNN that, based on the structure of FIG. 4, add a fourth branch network, add intermediate layers to the first, second, or third branch network, or add one or more fully connected layers; the present disclosure does not specifically limit this.
Based on the above description, this exemplary embodiment recognizes the objects in the image to be processed and, according to how the first value obtained by recognition compares with the preset threshold, judges whether the objects in the image are sparse or dense, thereby determining whether to adopt the first value or the second value obtained by density detection as the final result. On the one hand, if the first value is greater than the preset threshold, the objects in the image are dense and occlusion may be present; density detection is then used and the resulting second value is taken as the final result, so that the number of objects can be determined fairly accurately, giving this exemplary embodiment high accuracy. On the other hand, the combination of object recognition and density detection offers high flexibility: by adjusting the preset threshold, this exemplary embodiment can be applied to a variety of different scenarios, giving it high applicability.
In an exemplary embodiment, after a target image is acquired, it may be divided into a plurality of regions, and the image of each region is taken as an image to be processed. The target image is the complete image for which the number of objects needs to be determined, for example the original surveillance image of the scenic area shown in FIG. 2. Because the camera is mounted at a high angle with a wide shooting range, the captured image includes some fixed scenery, sky, and the like, introducing many interference factors that disturb the estimation of the visitor count; moreover, visitor distribution differs between dense and sparse regions, which can be handled separately in a targeted way. In view of this, referring to FIG. 5, FIG. 2 can be divided into multiple regions based on prior experience, the method flow of FIG. 1 is executed for each region image, and finally the object counts of all regions are added up to obtain the total number of objects in the target image.
In FIG. 5, no visitors can be present in region one, so the visitor count of region one can always be set to 0. Visitors in regions two and three are relatively sparse and fixed scenery occupies a large proportion, so object recognition can be used to recognize and count the visitors. Region four is where visitors mainly gather; it is rather dense and suffers from severe occlusion, so object recognition performs poorly there and density detection can be used to count the visitors.
Besides dividing regions based on prior experience, other approaches can also be used. Several exemplary approaches are given below, but they should not limit the scope of protection of the present disclosure:
(1) Dividing regions according to the distribution characteristics of the objects in the target image: first, perform object recognition on the target image to obtain the approximate position of each object; then roughly select a part where objects are densely distributed, draw a boundary line wherever two objects are more than a certain distance apart to obtain a region, and compute the object density of that region (number of objects in the region / image area of the region); then gradually expand the region in each direction, replacing the region with the expanded one if the object density increases, and reverting to the pre-expansion region if the density decreases, until the object density reaches its maximum, at which point the region is fixed as one divided region. The determined region is segmented out of the target image, and the above process is repeated on the remainder until region division is complete.
(2) Applicable to surveillance images where the scene captured by the camera does not change: a certain number of representative historical images are retrieved from the surveillance footage; for example, from the surveillance images of the past week, several frames taken between 2 p.m. and 3 p.m. each day (the peak visitor period in the scenic area) are selected. The image is divided into many small grid cells, and the visitor occurrence probability of each cell is computed (number of historical images in which a visitor appears in the cell / total number of selected historical images) to obtain a probability map. According to the probability distribution, cells with similar probabilities are connected into one region, thereby dividing the image into multiple regions. Surveillance images captured afterwards all use this region division result.
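The historical-probability division in approach (2) reduces to averaging binary occupancy grids over the selected historical frames. The frame data below is fabricated for illustration, and thresholding stands in for the more general "connect cells with similar probabilities" grouping.

```python
import numpy as np

# Binary occupancy per grid cell for three historical frames
# (1 = a visitor appeared in that cell); values are made up for illustration.
frames = np.array([
    [[0, 1, 1],
     [0, 0, 1]],
    [[0, 1, 1],
     [0, 1, 1]],
    [[0, 0, 1],
     [0, 0, 1]],
], dtype=float)

# Per-cell probability = frames containing a visitor / total frames selected.
prob_map = frames.mean(axis=0)

# Cells with similar probability are then connected into regions; a simple
# threshold is used here as a stand-in for that grouping step.
dense_region = prob_map >= 0.5
```

The resulting boolean mask marks the cells that would be grouped into the high-traffic region, and the same division is reused for all subsequent frames from that camera.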
After the target image is divided into multiple regions, the method of FIG. 1 is executed for each region image. The preset threshold used may be the same or different across the regions; that is, the regions may share a uniform preset threshold or each have its own corresponding preset threshold. For example, in FIG. 5, smaller preset thresholds can be set for regions two and three, and a larger preset threshold for region four. The preset threshold of each region may be determined from experience or computed from image features. For example, compute the partial image area of each region where visitors may appear, divide it by the image area occupied by a single visitor, and thereby estimate the number of visitors when the region is filled with visitors without occlusion; this number can be taken as the preset threshold, or it can be multiplied by an empirical coefficient less than 1 (such as 0.9) to obtain the preset threshold; the present disclosure does not specifically limit this. Using a targeted preset threshold for each region allows the total number of objects in the target image to be obtained more accurately.
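The area-based threshold heuristic above (fill area divided by per-object area, scaled by an empirical coefficient) is simple arithmetic; the pixel counts below are made-up values for illustration.

```python
def region_threshold(walkable_area_px, per_object_area_px, coeff=0.9):
    """Estimate a region's preset threshold: how many objects would fill the
    region without occlusion, scaled by an empirical coefficient (< 1)."""
    return int(walkable_area_px / per_object_area_px * coeff)

# e.g. 120,000 walkable pixels, each visitor occupying roughly 1,500 pixels
threshold = region_threshold(120_000, 1_500)  # -> 72
```

Beyond this count, detections in the region are considered unreliable and the flow falls back to density detection.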
FIG. 6 shows another flow of this exemplary embodiment, including: step S601, acquire a target image, which may for example be a surveillance image; step S602, divide the target image into multiple regions; step S603, take the image of each region as an image to be processed, and execute steps S604 to S608 for each: step S604, detect the number of objects in the image to be processed through object recognition, obtaining the first value; step S605, compare the first value with the preset threshold; step S606, if the first value is less than the preset threshold, determine the number of objects in the region as the first value; step S607, if the first value is greater than the preset threshold, the first value is not credible, and object density detection is further performed on the image to be processed to obtain the second value; step S608, determine the number of objects in the region as the second value. Based on the above process, the object count of each region can be obtained; finally, step S609 is executed to accumulate the object counts of all regions and obtain the total number of objects in the target image, which is thereby finally determined.
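The flow of FIG. 6 can be condensed into a small driver, with the detector and density model passed in as callables. Both stand-ins below are placeholders for the trained first and second neural network models, and the equal-to-threshold case is routed to density detection, one of the two options the text allows.

```python
def count_in_region(image, detect, density_count, threshold):
    """Steps S604-S608: trust detection when sparse, density when dense."""
    first_value = detect(image)
    if first_value < threshold:
        return first_value            # sparse: the first value is credible
    return density_count(image)       # dense: fall back to density detection

def count_in_target(regions, detect, density_count, thresholds):
    """Step S609: accumulate the per-region counts into the total."""
    return sum(
        count_in_region(img, detect, density_count, thr)
        for img, thr in zip(regions, thresholds)
    )

# Toy stand-ins: a "region" is represented by its true count; the fake
# detector saturates in crowded regions, which density detection corrects.
detect = lambda truth: min(truth, 40)
density = lambda truth: truth
total = count_in_target([5, 18, 120], detect, density, thresholds=[30, 30, 30])
# regions 1 and 2 use detection (5, 18); region 3 uses density (120) -> 143
```

Swapping in real models only requires that `detect` return an integer box count and `density_count` return the sum of a predicted density map.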
Exemplary embodiments of the present disclosure also provide an apparatus for determining the number of objects. As shown in FIG. 7, the apparatus 700 may include: a recognition module 710 for recognizing objects in an image to be processed and taking the number of recognized objects as a first value; a comparison module 720 for comparing the first value with a preset threshold; a first determination module 730 for determining the number of objects in the image to be processed as the first value if the first value is less than the preset threshold; and a second determination module 740 for performing density detection on the objects in the image to be processed if the first value is greater than the preset threshold, obtaining a second value for the number of objects, and determining the number of objects in the image to be processed as the second value.
In an exemplary embodiment, the apparatus 700 may further include: an acquisition module (not shown) for acquiring a target image, dividing the target image into a plurality of regions, and taking the image of each region as an image to be processed.
In an exemplary embodiment, each of the regions has a corresponding preset threshold.
In an exemplary embodiment, the recognition module 710 may be configured to recognize the objects in the image to be processed through a pre-trained first neural network model.
In an exemplary embodiment, the first neural network model may be a YOLO model.
In an exemplary embodiment, the second determination module 740 may include: a density detection unit (not shown) for performing density detection on the objects in the image to be processed through a pre-trained second neural network model.
In an exemplary embodiment, the second neural network model may include: a first branch network for performing first convolution processing on the image to be processed to obtain a first feature image; a second branch network for performing second convolution processing on the image to be processed to obtain a second feature image; a third branch network for performing third convolution processing on the image to be processed to obtain a third feature image; a merging layer for merging the first feature image, the second feature image, and the third feature image into a final feature image; and an output layer for mapping the final feature image to a density image.
Details of the solutions not disclosed in the above apparatus can be found in the method embodiments, and are therefore not repeated here.
Those skilled in the art can understand that the aspects of the present disclosure may be implemented as a system, a method, or a program product. Therefore, the aspects of the present disclosure may be embodied in the following forms: an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.), or an embodiment combining hardware and software, which may be collectively referred to herein as a "circuit", "module", or "system".
Exemplary embodiments of the present disclosure also provide a computer-readable storage medium on which a program product capable of implementing the above method of this specification is stored. In some possible implementations, the aspects of the present disclosure may also be implemented in the form of a program product, which includes program code; when the program product runs on a terminal device, the program code causes the terminal device to perform the steps according to the various exemplary embodiments of the present disclosure described in the "Exemplary Method" section above.
Referring to FIG. 8, a program product 800 for implementing the above method according to an exemplary embodiment of the present disclosure is described; it may take the form of a portable compact disc read-only memory (CD-ROM), include program code, and run on a terminal device such as a personal computer. However, the program product of the present disclosure is not limited thereto. In this document, a readable storage medium may be any tangible medium containing or storing a program that can be used by, or in combination with, an instruction execution system, apparatus, or device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A readable signal medium may also be any readable medium other than a readable storage medium that can send, propagate, or transmit a program for use by, or in combination with, an instruction execution system, apparatus, or device.
The program code contained on the readable medium may be transmitted by any suitable medium, including but not limited to wireless, wired, optical cable, RF, and the like, or any suitable combination of the above.
Program code for carrying out the operations of the present disclosure may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a standalone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In cases involving a remote computing device, the remote computing device may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (for example, through the Internet using an Internet service provider).
Exemplary embodiments of the present disclosure also provide an electronic device capable of implementing the above method. An electronic device 900 according to such an exemplary embodiment of the present disclosure is described below with reference to FIG. 9. The electronic device 900 shown in FIG. 9 is only an example, and should not impose any limitation on the functions and scope of use of the embodiments of the present disclosure.
As shown in FIG. 9, the electronic device 900 may take the form of a general-purpose computing device. The components of the electronic device 900 may include, but are not limited to: at least one processing unit 910, at least one storage unit 920, a bus 930 connecting different system components (including the storage unit 920 and the processing unit 910), and a display unit 940.
The storage unit 920 stores program code that can be executed by the processing unit 910, so that the processing unit 910 performs the steps according to the various exemplary embodiments of the present disclosure described in the "Exemplary Method" section above. For example, the processing unit 910 may perform the method steps shown in FIG. 1 or FIG. 6.
The storage unit 920 may include a readable medium in the form of a volatile storage unit, such as a random access storage unit (RAM) 921 and/or a cache storage unit 922, and may further include a read-only storage unit (ROM) 923.
The storage unit 920 may also include a program/utility 924 having a set of (at least one) program modules 925, such program modules 925 including but not limited to: an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination thereof, may include an implementation of a network environment.
The bus 930 may represent one or more of several types of bus structures, including a storage unit bus or storage unit controller, a peripheral bus, a graphics acceleration port, a processing unit, or a local bus using any of a variety of bus structures.
The electronic device 900 may also communicate with one or more external devices 1000 (such as a keyboard, pointing device, or Bluetooth device), with one or more devices that enable users to interact with the electronic device 900, and/or with any device (such as a router or modem) that enables the electronic device 900 to communicate with one or more other computing devices. Such communication can take place through an input/output (I/O) interface 950. Moreover, the electronic device 900 may also communicate with one or more networks (for example, a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 960. As shown in the figure, the network adapter 960 communicates with the other modules of the electronic device 900 through the bus 930. It should be understood that, although not shown in the figure, other hardware and/or software modules may be used in conjunction with the electronic device 900, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
Through the description of the above embodiments, those skilled in the art will readily understand that the example embodiments described here may be implemented by software, or by software combined with necessary hardware. Therefore, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, USB flash drive, portable hard disk, etc.) or on a network, and includes several instructions to cause a computing device (which may be a personal computer, a server, a terminal device, a network device, etc.) to execute the method according to the exemplary embodiments of the present disclosure.
In addition, the above drawings are merely schematic illustrations of the processing included in the methods according to the exemplary embodiments of the present disclosure, and are not for limiting purposes. It is easy to understand that the processing shown in the above drawings does not indicate or limit the chronological order of these processes. It is also easy to understand that these processes may be executed, for example, synchronously or asynchronously in multiple modules.
It should be noted that although several modules or units of the device for action execution are mentioned in the detailed description above, this division is not mandatory. In fact, according to the exemplary embodiments of the present disclosure, the features and functions of two or more modules or units described above may be embodied in one module or unit. Conversely, the features and functions of one module or unit described above may be further divided into multiple modules or units to be embodied.
Other embodiments of the present disclosure will readily occur to those skilled in the art upon consideration of the specification and practice of the invention disclosed here. The present application is intended to cover any variations, uses, or adaptations of the present disclosure that follow its general principles and include common knowledge or customary technical means in the technical field not disclosed by the present disclosure. The specification and embodiments are to be regarded as exemplary only, and the true scope and spirit of the present disclosure are indicated by the claims.
It should be understood that the present disclosure is not limited to the precise structure already described above and shown in the accompanying drawings, and that various modifications and changes can be made without departing from its scope. The scope of the present disclosure is limited only by the appended claims.

Claims (10)

  1. A method for determining the number of objects, characterized by comprising:
    recognizing objects in an image to be processed, and taking the number of the recognized objects as a first value;
    comparing the first value with a preset threshold;
    if the first value is less than the preset threshold, determining the number of the objects in the image to be processed as the first value;
    if the first value is greater than the preset threshold, performing density detection on the objects in the image to be processed to obtain a second value for the number of the objects, and determining the number of the objects in the image to be processed as the second value.
  2. The method according to claim 1, characterized in that the method further comprises:
    acquiring a target image, dividing the target image into a plurality of regions, and taking the image of each of the regions as the image to be processed.
  3. The method according to claim 2, characterized in that each of the regions has a corresponding preset threshold.
  4. The method according to claim 1, characterized in that the recognizing objects in the image to be processed comprises:
    recognizing the objects in the image to be processed through a pre-trained first neural network model.
  5. The method according to claim 4, characterized in that the first neural network model comprises a YOLO model.
  6. The method according to claim 1, characterized in that the performing density detection on the objects in the image to be processed comprises:
    performing density detection on the objects in the image to be processed through a pre-trained second neural network model.
  7. The method according to claim 6, characterized in that the second neural network model comprises:
    a first branch network for performing first convolution processing on the image to be processed to obtain a first feature image;
    a second branch network for performing second convolution processing on the image to be processed to obtain a second feature image;
    a third branch network for performing third convolution processing on the image to be processed to obtain a third feature image;
    a merging layer for merging the first feature image, the second feature image, and the third feature image into a final feature image;
    an output layer for mapping the final feature image to a density image.
  8. An apparatus for determining the number of objects, characterized by comprising:
    a recognition module for recognizing objects in an image to be processed and taking the number of the recognized objects as a first value;
    a comparison module for comparing the first value with a preset threshold;
    a first determination module for determining the number of the objects in the image to be processed as the first value if the first value is less than the preset threshold;
    a second determination module for performing density detection on the objects in the image to be processed if the first value is greater than the preset threshold, obtaining a second value for the number of the objects, and determining the number of the objects in the image to be processed as the second value.
  9. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the method according to any one of claims 1-7.
  10. An electronic device, characterized by comprising:
    a processor; and
    a memory for storing executable instructions of the processor;
    wherein the processor is configured to execute the method according to any one of claims 1-7 by executing the executable instructions.
PCT/CN2020/108677 2019-08-20 2020-08-12 Method and apparatus for determining the number of objects, storage medium, and electronic device WO2021031954A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910769944.8 2019-08-20
CN201910769944.8A CN110472599B (zh) Method and apparatus for determining the number of objects, storage medium, and electronic device

Publications (1)

Publication Number Publication Date
WO2021031954A1 true WO2021031954A1 (zh) 2021-02-25

Family

ID=68512644

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/108677 WO2021031954A1 (zh) 2019-08-20 2020-08-12 对象数量确定方法、装置、存储介质与电子设备

Country Status (2)

Country Link
CN (1) CN110472599B (zh)
WO (1) WO2021031954A1 (zh)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112988823A (zh) * 2021-03-09 2021-06-18 北京百度网讯科技有限公司 Object processing method, apparatus, device, and storage medium
CN113283499A (zh) * 2021-05-24 2021-08-20 南京航空航天大学 Deep-learning-based method for detecting the braiding density of three-dimensional braided fabrics
CN113486732A (zh) * 2021-06-17 2021-10-08 普联国际有限公司 Crowd density estimation method, apparatus, device, and storage medium
CN113807260A (zh) * 2021-09-17 2021-12-17 北京百度网讯科技有限公司 Data processing method, apparatus, electronic device, and storage medium
CN114785943A (zh) * 2022-03-31 2022-07-22 联想(北京)有限公司 Data determination method, device, and computer-readable storage medium
CN115384796A (zh) * 2022-04-01 2022-11-25 中国民用航空飞行学院 Airport management system capable of improving passenger transfer efficiency

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110472599B (zh) 2019-08-20 2021-09-03 北京海益同展信息科技有限公司 Method and apparatus for determining the number of objects, storage medium, and electronic device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130063282A1 (en) * 2010-05-31 2013-03-14 Central Signal, Llc Roadway detection
CN106845344A (zh) * 2016-12-15 2017-06-13 重庆凯泽科技股份有限公司 Crowd counting method and device
CN108399388A (zh) * 2018-02-28 2018-08-14 福州大学 Method for counting medium-to-high-density crowds
CN110472599A (zh) * 2019-08-20 2019-11-19 北京海益同展信息科技有限公司 Method and apparatus for determining the number of objects, storage medium, and electronic device

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1089214A3 (en) * 1999-09-30 2005-01-26 Matsushita Electric Industrial Co., Ltd. Apparatus and method for image recognition
CN101320427A (zh) * 2008-07-01 2008-12-10 北京中星微电子有限公司 Video surveillance method and system with an auxiliary target monitoring function
CN102831613B (zh) * 2012-08-29 2014-11-19 武汉大学 Parallel fractal network evolution image segmentation method
CN107093171B (zh) * 2016-02-18 2021-04-30 腾讯科技(深圳)有限公司 Image processing method, apparatus, and system
CN108009477B (zh) * 2017-11-10 2020-08-21 东软集团股份有限公司 Method and apparatus for detecting the number of people in an image, storage medium, and electronic device
CN110008783A (zh) * 2018-01-04 2019-07-12 杭州海康威视数字技术股份有限公司 Face liveness detection method and apparatus based on a neural network model, and electronic device
CN108875587A (zh) * 2018-05-24 2018-11-23 北京飞搜科技有限公司 Target distribution detection method and device
CN109224442B (zh) * 2018-09-03 2021-06-11 腾讯科技(深圳)有限公司 Data processing method and apparatus for a virtual scene, and storage medium
CN109389589A (zh) * 2018-09-28 2019-02-26 百度在线网络技术(北京)有限公司 Method and apparatus for counting people
CN109815868B (zh) * 2019-01-15 2022-02-01 腾讯科技(深圳)有限公司 Image target detection method, apparatus, and storage medium

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130063282A1 (en) * 2010-05-31 2013-03-14 Central Signal, Llc Roadway detection
CN106845344A (zh) * 2016-12-15 2017-06-13 重庆凯泽科技股份有限公司 Crowd counting method and device
CN108399388A (zh) * 2018-02-28 2018-08-14 福州大学 Method for counting medium-to-high-density crowds
CN110472599A (zh) * 2019-08-20 2019-11-19 北京海益同展信息科技有限公司 Method and apparatus for determining the number of objects, storage medium, and electronic device

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112988823A (zh) * 2021-03-09 2021-06-18 北京百度网讯科技有限公司 Object processing method, apparatus, device, and storage medium
CN112988823B (zh) * 2021-03-09 2024-06-04 北京百度网讯科技有限公司 Object processing method, apparatus, device, and storage medium
CN113283499A (zh) * 2021-05-24 2021-08-20 南京航空航天大学 Deep-learning-based method for detecting the braiding density of three-dimensional braided fabrics
CN113486732A (zh) * 2021-06-17 2021-10-08 普联国际有限公司 Crowd density estimation method, apparatus, device, and storage medium
CN113807260A (zh) * 2021-09-17 2021-12-17 北京百度网讯科技有限公司 Data processing method, apparatus, electronic device, and storage medium
CN113807260B (zh) * 2021-09-17 2022-07-12 北京百度网讯科技有限公司 Data processing method, apparatus, electronic device, and storage medium
CN114785943A (zh) * 2022-03-31 2022-07-22 联想(北京)有限公司 Data determination method, device, and computer-readable storage medium
CN114785943B (zh) * 2022-03-31 2024-03-05 联想(北京)有限公司 Data determination method, device, and computer-readable storage medium
CN115384796A (zh) * 2022-04-01 2022-11-25 中国民用航空飞行学院 Airport management system capable of improving passenger transfer efficiency

Also Published As

Publication number Publication date
CN110472599A (zh) 2019-11-19
CN110472599B (zh) 2021-09-03

Similar Documents

Publication Publication Date Title
WO2021031954A1 (zh) Method and apparatus for determining the number of objects, storage medium, and electronic device
US10735694B2 (en) System and method for activity monitoring using video data
CN108256404B (zh) Pedestrian detection method and device
CN105051754B (zh) Method and device for detecting a person through a monitoring system
WO2021051601A1 (zh) Method and system for selecting a detection box using Mask R-CNN, electronic device, and storage medium
CN110348522B (zh) Image detection and recognition method and system, electronic device, and image classification network optimization method and system
CN110781836A (zh) Human body recognition method and apparatus, computer device, and storage medium
WO2022041830A1 (zh) Pedestrian re-identification method and apparatus
WO2020094088A1 (zh) Image capturing method, surveillance camera, and surveillance system
TW202026948A (zh) Liveness detection method and apparatus, and storage medium
CN113205037B (zh) Event detection method and apparatus, electronic device, and readable storage medium
CN112861575A (zh) Pedestrian structuring method, apparatus, device, and storage medium
JP7273129B2 (ja) Lane detection method and apparatus, electronic device, storage medium, and vehicle
WO2023273041A1 (zh) Target detection method and apparatus in vehicle-road cooperation, and roadside device
US20150104067A1 (en) Method and apparatus for tracking object, and method for selecting tracking feature
AU2018379393A1 (en) Monitoring systems, and computer implemented methods for processing data in monitoring systems, programmed to enable identification and tracking of human targets in crowded environments
CN113780270B (zh) Target detection method and apparatus
WO2022199360A1 (zh) Method and apparatus for locating a moving object, electronic device, and storage medium
CN104376577A (zh) Particle-filter-based multi-camera multi-target tracking algorithm
CN113807361B (zh) Neural network, target detection method, neural network training method, and related products
WO2021022698A1 (zh) Tailgating detection method and apparatus, electronic device, and storage medium
CN112634368A (zh) Method and apparatus for generating a spatial and-or graph model of scene targets, and electronic device
CN114663871A (zh) Image recognition method, training method, apparatus, system, and storage medium
CN112541403A (zh) Indoor person fall detection method using an infrared camera
CN114613006A (zh) Long-range gesture recognition method and apparatus

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20854419

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20854419

Country of ref document: EP

Kind code of ref document: A1


32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 12.09.2022)