WO2017117801A1 - Bounding box generating apparatus and method - Google Patents

Bounding box generating apparatus and method

Info

Publication number
WO2017117801A1
WO2017117801A1 (PCT/CN2016/070479)
Authority
WO
WIPO (PCT)
Prior art keywords
motion information
bounding box
image
video image
target object
Prior art date
Application number
PCT/CN2016/070479
Other languages
English (en)
French (fr)
Inventor
伍健荣
刘晓青
Original Assignee
富士通株式会社
伍健荣
刘晓青
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 富士通株式会社, 伍健荣, 刘晓青 filed Critical 富士通株式会社
Priority to PCT/CN2016/070479 priority Critical patent/WO2017117801A1/zh
Publication of WO2017117801A1 publication Critical patent/WO2017117801A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion

Definitions

  • the present invention relates to the field of information technology, and in particular, to a bounding box generating apparatus and method.
  • a method of generating a bounding box for a target object generally includes a plurality of pre-processing steps and a step of processing the pre-processed result using a predetermined strategy to obtain a bounding box of the target object.
  • Embodiments of the present invention provide a bounding box generating apparatus and method, which generate a bounding box of a target object based on motion information in a video image, which can effectively reduce computational complexity and obtain high precision.
  • a bounding box generating apparatus comprising: a detecting unit, configured to perform motion detection on a video image to obtain a motion information image of the video image; and a generating unit, configured to generate a bounding box of the target object in the video image according to the motion information in the motion information image.
  • an electronic device comprising: the bounding box generating apparatus according to the first aspect of the embodiments of the present invention.
  • a method for generating a bounding box includes: performing motion detection on a video image to obtain a motion information image of the video image; and generating a location according to motion information in the motion information image The bounding box of the target object in the video image.
  • the invention has the beneficial effect that the bounding box of the target object is generated based on the motion information in the video image, which can effectively reduce the computational complexity and achieve high precision.
  • FIG. 1 is a schematic diagram of a bounding box generating apparatus according to Embodiment 1 of the present invention.
  • FIG. 2 is a schematic diagram of obtaining a motion information image after performing motion detection on a video image according to Embodiment 1 of the present invention
  • FIG. 3 is a schematic diagram of a generating unit 102 according to Embodiment 1 of the present invention.
  • FIG. 4 is a schematic diagram of selecting a bounding box of a target object according to an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of a first selection unit 302 according to Embodiment 1 of the present invention.
  • FIG. 6 is another schematic diagram of the generating unit 102 according to Embodiment 1 of the present invention.
  • FIG. 7 is a schematic diagram of extracting a sub-image and generating a bounding box for a sub-image according to Embodiment 1 of the present invention.
  • FIG. 8 is still another schematic diagram of the generating unit 102 according to Embodiment 1 of the present invention.
  • FIG. 9 is still another schematic diagram of the generating unit 102 according to Embodiment 1 of the present invention.
  • FIG. 10 is a schematic diagram of selecting a contour in a contour image according to Embodiment 1 of the present invention.
  • FIG. 11 is a schematic diagram of an electronic device according to Embodiment 2 of the present invention.
  • Figure 12 is a schematic block diagram showing the system configuration of an electronic device according to Embodiment 2 of the present invention.
  • FIG. 13 is a flowchart of a method for generating a bounding box according to Embodiment 3 of the present invention.
  • FIG. 14 is a flowchart of a method for generating a bounding box according to Embodiment 4 of the present invention.
  • FIG. 15 is a flowchart of a method for generating a bounding box according to Embodiment 5 of the present invention.
  • FIG. 16 is a flowchart of a method for generating a bounding box according to Embodiment 6 of the present invention.
  • Fig. 1 is a schematic diagram of a bounding box generating apparatus according to a first embodiment of the present invention. As shown in FIG. 1, the device 100 includes:
  • the detecting unit 101 is configured to perform motion detection on the video image to obtain a motion information image of the video image;
  • the generating unit 102 is configured to generate a bounding box of the target object in the video image according to the motion information in the motion information image.
  • the bounding box of the target object is generated based on the motion information in the video image, which can effectively reduce the computational complexity and achieve high precision.
  • the video image can be obtained using an existing method, for example, captured by an electronic device such as a camera, a digital camera, or a digital video camera;
  • the video image may include a plurality of frames of images that are consecutive in time, for example, the video image includes a current frame and a plurality of previous frames.
  • the required number of image frames can be set according to the requirements of the motion detection, for example, the number of frames is more than 2 frames.
  • the embodiment of the present invention does not limit the number of frames of a video image.
  • the detecting unit 101 may use an existing method to perform motion detection on the video image to obtain the motion information image, for example, differencing the multiple frames of the video image one by one and binarizing the differencing results, thereby obtaining a motion information image expressed in binarized form.
  • FIG. 2 is a schematic diagram of obtaining a motion information image after performing motion detection on a video image according to Embodiment 1 of the present invention. As shown in FIG. 2, after the video image 201 having the multi-frame image is detected by the motion of the detecting unit 101, the binarized motion information image 202 is obtained.
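The frame-differencing detection described above can be sketched as follows. This is a minimal illustration, not the patent's prescribed implementation: the threshold value and the plain nested-list image representation are assumptions for the example.

```python
def motion_information_image(frames, diff_threshold=25):
    """Frame-by-frame differencing followed by binarization.

    frames: list of 2-D lists of grayscale values (consecutive video frames).
    diff_threshold: illustrative binarization threshold (an assumption;
    the patent does not fix a value).
    Returns a binary motion information image: 1 = motion, 0 = no motion.
    """
    rows, cols = len(frames[0]), len(frames[0][0])
    motion = [[0] * cols for _ in range(rows)]
    # Difference each pair of consecutive frames and accumulate motion pixels.
    for prev, curr in zip(frames, frames[1:]):
        for r in range(rows):
            for c in range(cols):
                if abs(curr[r][c] - prev[r][c]) > diff_threshold:
                    motion[r][c] = 1
    return motion
```

With two or more input frames, as the text requires, the result corresponds to the binarized motion information image 202 of FIG. 2.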
  • after obtaining the motion information image of the video image, the generating unit 102 generates a bounding box of the target object in the video image according to the motion information in the motion information image.
  • the bounding box of the target object in the video image may be the bounding box of the target object in the last frame of the video image, that is, the current frame.
  • the target object refers to an object to be detected, for example, a moving person or a vehicle or the like.
  • the structure of the generating unit 102 of the present embodiment and the method of generating a bounding box based on the motion information will be exemplarily described below.
  • FIG. 3 is a schematic diagram of a generating unit 102 according to Embodiment 1 of the present invention. As shown in FIG. 3, the generating unit 102 includes:
  • a first generating unit 301 configured to generate a bounding box of the target object in the video image
  • the first selecting unit 302 is configured to select the generated bounding box according to the motion information in the motion information image.
  • the first generating unit 301 can generate the bounding boxes of the target object in the video image using an existing method, for example, first performing multiple pre-processing steps to calculate features such as edges, color blocks, contours, and saliency, and then processing the calculated features using a preset strategy to obtain the bounding boxes of the target object.
  • the first selection unit 302 selects the generated bounding box according to the motion information in the motion information image.
  • FIG. 4 is a schematic diagram of selecting a bounding box of a target object according to an embodiment of the present invention. As shown in FIG. 4, the first selection unit 302 selects, based on the previously obtained motion information image 401 (for example, the binarized motion information image 202 in FIG. 2), among the bounding boxes generated by the first generating unit 301 in the video image 402, obtaining a video image 403 with the selected bounding boxes.
  • the structure of the first selection unit and the method of selecting the bounding box of the present embodiment will be exemplarily described below.
  • FIG. 5 is a schematic diagram of the first selection unit 302 of Embodiment 1 of the present invention. As shown in FIG. 5, the first selection unit 302 includes:
  • a second selecting unit 501 configured to perform the selection of bounding boxes according to the amount of motion information in each bounding box; or
  • a third selecting unit 502 configured to perform the selection of bounding boxes according to the motion information occupancy rate in each bounding box.
  • the second selection unit 501 performs the selection of bounding boxes according to the amount of motion information in each bounding box. For example, a bounding box whose amount of motion information is greater than a first threshold may be taken as a selected bounding box, where the amount of motion information may be represented by a parameter such as the number or area of pixels having motion information, and the first threshold may be set according to actual needs, for example, a value greater than or equal to 100.
  • the third selection unit 502 performs the selection of bounding boxes according to the motion information occupancy rate in each bounding box. For example, a bounding box whose motion information occupancy rate is greater than a second threshold may be taken as a selected bounding box, where the motion information occupancy rate may be represented, for example, by the ratio of the number or area of pixels having motion information to the total number of pixels or the area of the bounding box, and the second threshold may be set according to actual needs, for example, a value between 0.5 and 1.
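The two selection criteria above can be sketched directly. The thresholds (100 motion pixels, occupancy 0.5) follow the example values in the text; the `(top, left, bottom, right)` box format with exclusive bottom/right bounds is an assumption made for this illustration.

```python
def select_bounding_boxes(motion, boxes, amount_threshold=100,
                          occupancy_threshold=0.5, by="amount"):
    """Select bounding boxes using the binarized motion information image.

    motion: 2-D binary list, 1 = pixel has motion information.
    boxes:  list of (top, left, bottom, right) boxes, bottom/right exclusive
            (an illustrative format, not specified by the patent).
    by:     "amount" selects by motion information amount (second selecting
            unit 501); "occupancy" selects by motion information occupancy
            rate (third selecting unit 502).
    """
    selected = []
    for top, left, bottom, right in boxes:
        # Amount of motion information: count of motion pixels inside the box.
        amount = sum(motion[r][c]
                     for r in range(top, bottom)
                     for c in range(left, right))
        area = (bottom - top) * (right - left)
        if by == "amount" and amount >= amount_threshold:
            selected.append((top, left, bottom, right))
        # Occupancy rate: motion pixels divided by total pixels in the box.
        elif by == "occupancy" and amount / area >= occupancy_threshold:
            selected.append((top, left, bottom, right))
    return selected
```

A box tightly fitting a moving object scores high on both criteria, while a box over a static background region is discarded.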
  • FIG. 6 is another schematic diagram of the generating unit 102 according to Embodiment 1 of the present invention. As shown in FIG. 6, the generating unit 102 includes:
  • the extracting unit 601 is configured to extract, according to the motion information in the motion information image, a sub-image having the target object in the video image;
  • the second generating unit 602 is configured to generate a bounding box of the target object in the video image for the sub-image having the target object.
  • the extracting unit 601 can extract a sub-image having a target object in the video image using an existing method. For example, object detection is performed on the obtained motion information image, an area where the moving object is located is detected, and a corresponding area is extracted from the current frame of the video image as a sub-image.
  • the second generating unit 602 may use an existing method to generate bounding boxes for the extracted sub-images, for example, performing pre-processing on each sub-image separately and processing the features calculated by the pre-processing using a preset strategy.
  • FIG. 7 is a schematic diagram of extracting a sub-image and generating a bounding box for a sub-image according to Embodiment 1 of the present invention.
  • as shown in FIG. 7, the extracting unit 601 extracts the sub-images 703, 704, and 705 from the current frame 702 of the video image according to the motion information image 701 (for example, the binarized motion information image 202 in FIG. 2), in which the regions where the moving objects are located have been detected.
  • the second generation unit 602 generates a bounding box of the target object for the sub-images 703, 704, 705 and displays it on the output video image 706.
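One way to realize the extraction step is connected-component labeling on the binary motion information image followed by cropping the current frame. The patent only says an existing object-detection method may be used, so the 4-connected BFS below is an illustrative choice, not the prescribed one.

```python
from collections import deque

def extract_subimages(motion, frame):
    """Extract sub-images of the current frame where motion was detected.

    motion: 2-D binary list (1 = motion pixel); frame: same-sized 2-D list
    of pixel values (the current frame). Finds 4-connected regions of
    motion pixels and crops the enclosing axis-aligned region of each.
    """
    rows, cols = len(motion), len(motion[0])
    seen = [[False] * cols for _ in range(rows)]
    subimages = []
    for r in range(rows):
        for c in range(cols):
            if motion[r][c] and not seen[r][c]:
                # BFS over one moving region to find its extent.
                q = deque([(r, c)])
                seen[r][c] = True
                top, left, bottom, right = r, c, r, c
                while q:
                    y, x = q.popleft()
                    top, bottom = min(top, y), max(bottom, y)
                    left, right = min(left, x), max(right, x)
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and motion[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            q.append((ny, nx))
                # Crop the corresponding region from the current frame.
                subimages.append([row[left:right + 1]
                                  for row in frame[top:bottom + 1]])
    return subimages
```

Each returned sub-image then receives its own pre-processing and bounding box generation, as the second generating unit 602 does for sub-images 703, 704, and 705.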
  • FIG. 8 is still another schematic diagram of the generating unit 102 according to Embodiment 1 of the present invention. As shown in FIG. 8, the generating unit 102 includes:
  • a pre-processing unit 801 configured to select features of the video image for pre-processing according to the motion information in the motion information image;
  • the third generating unit 802 is configured to process the pre-processed feature according to a preset policy to generate a bounding box of the target object in the video image.
  • the generating unit 102 may include a plurality of pre-processing units, each of which calculates a different feature, wherein at least one pre-processing unit selects features of the video image for pre-processing according to the motion information in the motion information image. That is, the generating unit 102 includes at least one pre-processing unit 801, and may also include other existing pre-processing units.
  • FIG. 9 is still another schematic diagram of the generating unit 102 according to Embodiment 1 of the present invention.
  • the generating unit 102 includes N pre-processing units 901-1, 901-2, ..., 901-N and a third generating unit 902, where at least one of the N pre-processing units has the same structure and function as the pre-processing unit 801, the other pre-processing units are existing pre-processing units, N is a positive integer, and the third generating unit 902 has the same structure and function as the third generating unit 802.
  • the features calculated by the pre-processing unit may include: contour, color similarity, color saliency, texture, and the like.
  • the pre-processing unit 801 selects the feature of the video image according to the motion information in the motion information image for pre-processing, that is, calculates the selected feature. For example, the pre-processing unit 801 selects a partial contour in the contour image of the current frame of the video image according to the moving object in the motion information image to perform pre-processing, and calculates the contour feature.
  • Figure 10 is a diagram showing the selection of contours in a contour image according to Embodiment 1 of the present invention.
  • the pre-processing unit 801 selects the contours in the contour image 1002 according to the moving objects in the motion information image 1001 (for example, the binarized motion information image 202 in FIG. 2), obtaining the selected contour image 1003; the pre-processing unit 801 then pre-processes the contour image 1003 to calculate the contour features therein.
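The contour selection step amounts to keeping only the contour pixels that coincide with detected moving objects. The element-wise AND below is a minimal sketch of "selecting a partial contour according to the moving object"; the actual selection rule (for example, dilating the motion mask before masking) is an implementation choice not fixed by the patent.

```python
def select_contours(contour_image, motion):
    """Keep only the contour pixels that fall on detected moving objects.

    contour_image and motion are same-sized 2-D binary lists
    (1 = contour pixel / motion pixel). Returns the selected contour
    image, analogous to image 1003 obtained from images 1001 and 1002.
    """
    return [[1 if contour_image[r][c] and motion[r][c] else 0
             for c in range(len(contour_image[0]))]
            for r in range(len(contour_image))]
```

Because only the selected contours are pre-processed, the subsequent feature computation runs on far fewer pixels, which is the source of the claimed reduction in computational complexity.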
  • the third generating unit 802 processes the pre-processed features using a preset strategy to generate the bounding box of the target object in the video image, wherein the preset strategy may be an existing strategy, for example, selective search.
  • the bounding box of the target object is generated based on the motion information in the video image, which can effectively reduce the computational complexity and achieve high precision.
  • FIG. 11 is a schematic diagram of an electronic device according to Embodiment 2 of the present invention.
  • the electronic device 1100 includes a bounding box generating device 1101, wherein the structure and function of the bounding box generating device 1101 are the same as those described in Embodiment 1 and are not repeated here.
  • the electronic device is, for example, a device having an image capturing function such as a camera, a digital camera, a digital video camera, a smart phone, or the like.
  • Figure 12 is a schematic block diagram showing the system configuration of an electronic device according to a second embodiment of the present invention.
  • electronic device 1200 can include central processor 1201 and memory 1202; memory 1202 is coupled to central processor 1201.
  • the figure is exemplary; other types of structures may also be used, in addition to or in place of this structure, to implement telecommunications or other functions.
  • the electronic device 1200 may further include: an input unit 1203, a display 1204, and a power source 1205.
  • the functions of the bounding box generating apparatus described in Embodiment 1 may be integrated into the central processing unit 1201.
  • the central processing unit 1201 may be configured to: perform motion detection on the video image to obtain a motion information image of the video image; and generate a boundary of the target object in the video image according to the motion information in the motion information image. frame.
  • the selecting among the generated bounding boxes according to the motion information in the motion information image includes: selecting bounding boxes according to the amount of motion information in each bounding box; or selecting bounding boxes according to the motion information occupancy rate in each bounding box.
  • the performing the selection of bounding boxes according to the amount of motion information in each bounding box includes: taking a bounding box whose amount of motion information is greater than or equal to the first threshold as a selected bounding box; the performing the selection of bounding boxes according to the motion information occupancy rate includes: taking a bounding box whose motion information occupancy rate is greater than or equal to the second threshold as a selected bounding box.
  • the bounding box generating apparatus described in Embodiment 1 may be configured separately from the central processing unit 1201.
  • the bounding box generating apparatus may be configured as a chip connected to the central processing unit 1201, with the functions of the bounding box generating apparatus implemented under the control of the central processing unit 1201.
  • in this embodiment, the electronic device 1200 does not necessarily include all of the components shown in FIG. 12.
  • the central processor 1201, sometimes also referred to as a controller or operational control, may include a microprocessor or other processor device and/or logic device; the central processor 1201 receives input and controls the operation of the various components of the electronic device 1200.
  • Memory 1202 can be one or more of a buffer, a flash memory, a hard drive, a removable medium, a volatile memory, a non-volatile memory, or other suitable device.
  • the central processing unit 1201 can execute the program stored by the memory 1202 to implement information storage or processing and the like.
  • the functions of other components are similar to those of the existing ones and will not be described here.
  • the various components of electronic device 1200 may be implemented by special purpose hardware, firmware, software, or a combination thereof without departing from the scope of the invention.
  • the bounding box of the target object is generated based on the motion information in the video image, which can effectively reduce the computational complexity and achieve high precision.
  • the embodiment of the invention further provides a method for generating a bounding box, which corresponds to the bounding box generating device of the first embodiment.
  • Figure 13 is a flow chart showing a method of generating a bounding box according to a third embodiment of the present invention. As shown in FIG. 13, the method includes:
  • Step 1301 Perform motion detection on the video image to obtain a motion information image of the video image.
  • Step 1302 Generate a bounding box of the target object in the video image according to the motion information in the motion information image.
  • the method of performing motion detection on a video image and the method of generating a bounding box according to motion information are the same as those in Embodiment 1, and details are not described herein again.
  • the bounding box of the target object is generated based on the motion information in the video image, which can effectively reduce the computational complexity and achieve high precision.
  • the embodiment of the invention further provides a method for generating a bounding box, which corresponds to the bounding box generating device of the first embodiment.
  • Figure 14 is a flow chart showing a method of generating a bounding box according to a fourth embodiment of the present invention. As shown in FIG. 14, the method includes:
  • Step 1401 Perform motion detection on the input video image to obtain a motion information image of the video image.
  • Step 1402 Generate a bounding box of the target object in the video image
  • Step 1403 Select the generated bounding box according to the motion information in the motion information image.
  • the method of performing motion detection on a video image, the method of generating a bounding box, and the method of selecting a bounding box according to motion information are the same as those in Embodiment 1, and details are not described herein again.
  • the bounding box of the target object is generated based on the motion information in the video image, which can effectively reduce the computational complexity and achieve high precision.
  • the embodiment of the invention further provides a method for generating a bounding box, which corresponds to the bounding box generating device of the first embodiment.
  • Figure 15 is a flow chart showing a method of generating a bounding box according to a fifth embodiment of the present invention. As shown in Figure 15, the method includes:
  • Step 1501 Perform motion detection on the input video image to obtain a motion information image of the video image.
  • Step 1502 Extract, according to motion information in the motion information image, a sub-image having a target object in the video image;
  • Step 1503 Generate a bounding box of the target object in the video image for the sub-image having the target object.
  • the method of performing motion detection on a video image, the method of extracting a sub-image, and the method of generating a bounding box for a sub-image are the same as those in Embodiment 1, and are not described herein again.
  • the bounding box of the target object is generated based on the motion information in the video image, which can effectively reduce the computational complexity and achieve high precision.
  • the embodiment of the invention further provides a method for generating a bounding box, which corresponds to the bounding box generating device of the first embodiment.
  • Figure 16 is a flow chart showing a method of generating a bounding box according to a sixth embodiment of the present invention. As shown in FIG. 16, the method includes:
  • Step 1601 Perform motion detection on the input video image to obtain a motion information image of the video image.
  • Step 1602 Select a feature of the video image according to motion information in the motion information image for preprocessing
  • Step 1603 Process the pre-processed feature according to a preset policy to generate a bounding box of the target object in the video image.
  • the method of performing motion detection on a video image, the method of selecting features of the video image, the method of pre-processing the selected features, and the method of processing the pre-processed features according to a preset strategy are the same as those described in Embodiment 1, and are not repeated here.
  • the bounding box of the target object is generated based on the motion information in the video image, which can effectively reduce the computational complexity and achieve high precision.
  • an embodiment of the present invention further provides a computer readable program, wherein when the program is executed in a bounding box generating device or an electronic device, the program causes a computer to execute, in the bounding box generating device or the electronic device, the bounding box generating method described in any one of Embodiments 3 to 6.
  • an embodiment of the present invention further provides a storage medium storing a computer readable program, wherein the computer readable program causes a computer to execute, in a bounding box generating device or an electronic device, the bounding box generating method described in any one of Embodiments 3 to 6.
  • the above apparatus and method of the present invention may be implemented by hardware or by hardware in combination with software.
  • the present invention relates to a computer readable program that, when executed by a logic component, enables the logic component to implement the apparatus or constituent components described above, or to implement the various methods or steps described above.
  • the present invention also relates to a storage medium for storing the above program, such as a hard disk, a magnetic disk, an optical disk, a DVD, a flash memory, or the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)
  • Studio Devices (AREA)

Abstract

A bounding box generating apparatus and method. The apparatus includes: a detecting unit configured to perform motion detection on a video image to obtain a motion information image of the video image; and a generating unit configured to generate, according to motion information in the motion information image, a bounding box of a target object in the video image. Generating the bounding box of the target object based on the motion information in the video image can effectively reduce the computational complexity and achieve high precision.

Description

Bounding Box Generating Apparatus and Method

Technical Field
The present invention relates to the field of information technology, and in particular, to a bounding box generating apparatus and method.
Background Art
With the continuous development of information technology, applications of computer vision and intelligent transportation systems have become increasingly widespread. These applications require various kinds of processing of the obtained video images, for example, generating a bounding box for a target object in a video image so as to determine the position of the target object and, when multiple target objects are stuck together, to separate the individual target objects.
At present, methods for generating a bounding box for a target object generally include multiple pre-processing steps and a step of processing the pre-processed results using a preset strategy to obtain the bounding box of the target object.
It should be noted that the above introduction to the technical background is given merely to facilitate a clear and complete description of the technical solutions of the present invention and the understanding of those skilled in the art. These technical solutions should not be considered well known to those skilled in the art merely because they are described in the background section of the present invention.
Summary of the Invention
When the bounding box of a target object is generated using the above existing methods, the computational complexity is high and the precision is low.
Embodiments of the present invention provide a bounding box generating apparatus and method that generate the bounding box of a target object based on motion information in a video image, which can effectively reduce the computational complexity and achieve high precision.
According to a first aspect of the embodiments of the present invention, there is provided a bounding box generating apparatus, the apparatus including: a detecting unit configured to perform motion detection on a video image to obtain a motion information image of the video image; and a generating unit configured to generate, according to motion information in the motion information image, a bounding box of a target object in the video image.
According to a second aspect of the embodiments of the present invention, there is provided an electronic device including the bounding box generating apparatus according to the first aspect of the embodiments of the present invention.
According to a third aspect of the embodiments of the present invention, there is provided a bounding box generating method, including: performing motion detection on a video image to obtain a motion information image of the video image; and generating, according to motion information in the motion information image, a bounding box of a target object in the video image.
An advantageous effect of the present invention is that the bounding box of the target object is generated based on the motion information in the video image, which can effectively reduce the computational complexity and achieve high precision.
With reference to the following description and drawings, specific embodiments of the present invention are disclosed in detail, indicating the manner in which the principles of the invention may be employed. It should be understood that the embodiments of the invention are not thereby limited in scope. Within the spirit and scope of the appended claims, the embodiments of the invention include many changes, modifications, and equivalents.
Features described and/or illustrated for one embodiment may be used in the same or a similar manner in one or more other embodiments, combined with features in other embodiments, or substituted for features in other embodiments.
It should be emphasized that the term "comprise/include", as used herein, refers to the presence of a feature, integer, step, or component, but does not exclude the presence or addition of one or more other features, integers, steps, or components.
Brief Description of the Drawings
The included drawings are provided for a further understanding of the embodiments of the present invention; they constitute a part of the specification, illustrate embodiments of the invention, and, together with the written description, explain the principles of the invention. Obviously, the drawings described below are only some embodiments of the invention, and those of ordinary skill in the art may derive other drawings from them without inventive effort. In the drawings:
FIG. 1 is a schematic diagram of a bounding box generating apparatus according to Embodiment 1 of the present invention;
FIG. 2 is a schematic diagram of a motion information image obtained after performing motion detection on a video image according to Embodiment 1 of the present invention;
FIG. 3 is a schematic diagram of the generating unit 102 according to Embodiment 1 of the present invention;
FIG. 4 is a schematic diagram of selecting among bounding boxes of target objects according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of the first selecting unit 302 according to Embodiment 1 of the present invention;
FIG. 6 is another schematic diagram of the generating unit 102 according to Embodiment 1 of the present invention;
FIG. 7 is a schematic diagram of extracting sub-images and generating bounding boxes for the sub-images according to Embodiment 1 of the present invention;
FIG. 8 is still another schematic diagram of the generating unit 102 according to Embodiment 1 of the present invention;
FIG. 9 is still another schematic diagram of the generating unit 102 according to Embodiment 1 of the present invention;
FIG. 10 is a schematic diagram of selecting contours in a contour image according to Embodiment 1 of the present invention;
FIG. 11 is a schematic diagram of an electronic device according to Embodiment 2 of the present invention;
FIG. 12 is a schematic block diagram of the system configuration of an electronic device according to Embodiment 2 of the present invention;
FIG. 13 is a flowchart of a bounding box generating method according to Embodiment 3 of the present invention;
FIG. 14 is a flowchart of a bounding box generating method according to Embodiment 4 of the present invention;
FIG. 15 is a flowchart of a bounding box generating method according to Embodiment 5 of the present invention;
FIG. 16 is a flowchart of a bounding box generating method according to Embodiment 6 of the present invention.
Detailed Description of the Embodiments
The foregoing and other features of the present invention will become apparent from the following description with reference to the drawings. In the description and drawings, specific embodiments of the invention are disclosed, indicating some of the embodiments in which the principles of the invention may be employed. It should be understood that the invention is not limited to the described embodiments; on the contrary, the invention includes all modifications, variations, and equivalents falling within the scope of the appended claims.
Embodiment 1
FIG. 1 is a schematic diagram of a bounding box generating apparatus according to Embodiment 1 of the present invention. As shown in FIG. 1, the apparatus 100 includes:
a detecting unit 101 configured to perform motion detection on a video image to obtain a motion information image of the video image; and
a generating unit 102 configured to generate, according to motion information in the motion information image, a bounding box of a target object in the video image.
As can be seen from the above embodiment, generating the bounding box of the target object based on the motion information in the video image can effectively reduce the computational complexity and achieve high precision.
In this embodiment, the video image may be obtained using an existing method, for example, captured by an electronic device such as a camera, a digital camera, or a digital video camera.
In this embodiment, the video image may include multiple frames that are consecutive in time; for example, the video image includes a current frame and multiple previous frames. The required number of frames may be set according to the requirements of the motion detection, for example, two or more frames; the embodiments of the present invention do not limit the number of frames of the video image.
In this embodiment, the detecting unit 101 may use an existing method to perform motion detection on the video image to obtain the motion information image, for example, differencing the multiple frames of the video image one by one and binarizing the differencing results, thereby obtaining a motion information image expressed in binarized form.
FIG. 2 is a schematic diagram of a motion information image obtained after performing motion detection on a video image according to Embodiment 1 of the present invention. As shown in FIG. 2, after the video image 201 having multiple frames passes through the motion detection of the detecting unit 101, a binarized motion information image 202 is obtained.
In this embodiment, after the motion information image of the video image is obtained, the generating unit 102 generates, according to the motion information in the motion information image, the bounding box of the target object in the video image, where the bounding box of the target object in the video image may be the bounding box of the target object in the last frame of the video image, that is, the current frame.
In this embodiment, the target object refers to an object to be detected, for example, a moving person or vehicle.
The structure of the generating unit 102 of this embodiment and the method of generating a bounding box according to the motion information are exemplarily described below.
FIG. 3 is a schematic diagram of the generating unit 102 according to Embodiment 1 of the present invention. As shown in FIG. 3, the generating unit 102 includes:
a first generating unit 301 configured to generate bounding boxes of the target object in the video image; and
a first selecting unit 302 configured to select among the generated bounding boxes according to the motion information in the motion information image.
In this embodiment, the first generating unit 301 may use an existing method to generate the bounding boxes of the target object in the video image, for example, first performing multiple pre-processing steps to calculate features such as edges, color blocks, contours, and saliency, and then processing the calculated features using a preset strategy to obtain the bounding boxes of the target object.
In this embodiment, the first selecting unit 302 selects among the generated bounding boxes according to the motion information in the motion information image. FIG. 4 is a schematic diagram of selecting among bounding boxes of target objects according to an embodiment of the present invention. As shown in FIG. 4, the first selecting unit 302 selects, based on the previously obtained motion information image 401 (for example, the binarized motion information image 202 in FIG. 2), among the bounding boxes generated by the first generating unit 301 in the video image 402, obtaining a video image 403 having the selected bounding boxes.
The structure of the first selecting unit of this embodiment and the method of selecting bounding boxes are exemplarily described below.
FIG. 5 is a schematic diagram of the first selecting unit 302 according to Embodiment 1 of the present invention. As shown in FIG. 5, the first selecting unit 302 includes:
a second selecting unit 501 configured to select bounding boxes according to the amount of motion information in each bounding box; or
a third selecting unit 502 configured to select bounding boxes according to the motion information occupancy rate in each bounding box.
In this embodiment, the second selecting unit 501 selects bounding boxes according to the amount of motion information in each bounding box; for example, a bounding box whose amount of motion information is greater than a first threshold may be taken as a selected bounding box, where the amount of motion information may be represented by a parameter such as the number or area of pixels having motion information, and the first threshold may be set according to actual needs, for example, a value greater than or equal to 100.
In this embodiment, the third selecting unit 502 selects bounding boxes according to the motion information occupancy rate in each bounding box; for example, a bounding box whose motion information occupancy rate is greater than a second threshold may be taken as a selected bounding box, where the motion information occupancy rate may be represented, for example, by the ratio of the number or area of pixels having motion information to the total number of pixels or the area of the bounding box, and the second threshold may be set according to actual needs, for example, a value between 0.5 and 1.
FIG. 6 is another schematic diagram of the generating unit 102 according to Embodiment 1 of the present invention. As shown in FIG. 6, the generating unit 102 includes:
an extracting unit 601 configured to extract, according to the motion information in the motion information image, sub-images having the target object in the video image; and
a second generating unit 602 configured to generate, for the sub-images having the target object, bounding boxes of the target object in the video image.
In this embodiment, the extracting unit 601 may use an existing method to extract the sub-images having the target object in the video image, for example, performing object detection on the obtained motion information image, detecting the regions where the moving objects are located, and extracting the corresponding regions from the current frame of the video image as sub-images.
In this embodiment, the second generating unit 602 may use an existing method to generate bounding boxes for the extracted sub-images, for example, performing pre-processing on each sub-image separately and processing the features calculated by the pre-processing using a preset strategy.
FIG. 7 is a schematic diagram of extracting sub-images and generating bounding boxes for the sub-images according to Embodiment 1 of the present invention. As shown in FIG. 7, the extracting unit 601 extracts the sub-images 703, 704, and 705 from the current frame 702 of the video image according to the motion information image 701 (for example, the binarized motion information image 202 in FIG. 2), in which the regions where the moving objects are located have been detected, and the second generating unit 602 generates the bounding boxes of the target objects for the sub-images 703, 704, and 705 and displays them on the output video image 706.
FIG. 8 is still another schematic diagram of the generating unit 102 according to Embodiment 1 of the present invention. As shown in FIG. 8, the generating unit 102 includes:
a pre-processing unit 801 configured to select features of the video image for pre-processing according to the motion information in the motion information image; and
a third generating unit 802 configured to process the pre-processed features according to a preset strategy to generate the bounding box of the target object in the video image.
In this embodiment, the generating unit 102 may include multiple pre-processing units, each calculating a different feature, where at least one pre-processing unit selects features of the video image for pre-processing according to the motion information in the motion information image. That is, the generating unit 102 includes at least one pre-processing unit 801, and may also include other existing pre-processing units.
FIG. 9 is still another schematic diagram of the generating unit 102 according to Embodiment 1 of the present invention. As shown in FIG. 9, the generating unit 102 includes N pre-processing units 901-1, 901-2, ..., 901-N and a third generating unit 902, where at least one of the N pre-processing units has the same structure and function as the pre-processing unit 801, the other pre-processing units are existing pre-processing units, N is a positive integer, and the third generating unit 902 has the same structure and function as the third generating unit 802.
In this embodiment, the features calculated by the pre-processing units may include contour, color similarity, color saliency, texture, and the like.
In this embodiment, the pre-processing unit 801 selects features of the video image for pre-processing according to the motion information in the motion information image, that is, it calculates the selected features. For example, the pre-processing unit 801 selects, according to the moving objects in the motion information image, partial contours in the contour image of the current frame of the video image for pre-processing, and calculates the contour features.
FIG. 10 is a schematic diagram of selecting contours in a contour image according to Embodiment 1 of the present invention. As shown in FIG. 10, the pre-processing unit 801 selects, according to the moving objects in the motion information image 1001 (for example, the binarized motion information image 202 in FIG. 2), the contours in the contour image 1002, obtaining the selected contour image 1003; the pre-processing unit 801 then pre-processes the contour image 1003 to calculate the contour features therein.
In this embodiment, the third generating unit 802 processes the pre-processed features using a preset strategy to generate the bounding box of the target object in the video image, where the preset strategy may be an existing strategy, for example, selective search.
As can be seen from the above embodiment, generating the bounding box of the target object based on the motion information in the video image can effectively reduce the computational complexity and achieve high precision.
Embodiment 2
An embodiment of the present invention further provides an electronic device. FIG. 11 is a schematic diagram of an electronic device according to Embodiment 2 of the present invention. As shown in FIG. 11, the electronic device 1100 includes a bounding box generating apparatus 1101, where the structure and function of the bounding box generating apparatus 1101 are the same as those described in Embodiment 1 and are not repeated here.
In this embodiment, the electronic device is, for example, a device having an image capturing function, such as a camera, a digital camera, a digital video camera, or a smartphone.
FIG. 12 is a schematic block diagram of the system configuration of an electronic device according to Embodiment 2 of the present invention. As shown in FIG. 12, the electronic device 1200 may include a central processor 1201 and a memory 1202, the memory 1202 being coupled to the central processor 1201. The figure is exemplary; other types of structures may also be used, in addition to or in place of this structure, to implement telecommunications or other functions.
As shown in FIG. 12, the electronic device 1200 may further include an input unit 1203, a display 1204, and a power supply 1205.
In one implementation, the functions of the bounding box generating apparatus described in Embodiment 1 may be integrated into the central processor 1201. The central processor 1201 may be configured to: perform motion detection on a video image to obtain a motion information image of the video image; and generate, according to motion information in the motion information image, a bounding box of a target object in the video image.
The generating, according to the motion information in the motion information image, the bounding box of the target object in the video image includes: generating bounding boxes of the target object in the video image; and selecting among the generated bounding boxes according to the motion information in the motion information image.
The selecting among the generated bounding boxes according to the motion information in the motion information image includes: selecting bounding boxes according to the amount of motion information in each bounding box; or selecting bounding boxes according to the motion information occupancy rate in each bounding box.
The selecting bounding boxes according to the amount of motion information in each bounding box includes: taking a bounding box whose amount of motion information is greater than or equal to a first threshold as a selected bounding box. The selecting bounding boxes according to the motion information occupancy rate includes: taking a bounding box whose motion information occupancy rate is greater than or equal to a second threshold as a selected bounding box.
The generating, according to the motion information in the motion information image, the bounding box of the target object in the video image includes: extracting, according to the motion information in the motion information image, sub-images having the target object in the video image; and generating, for the sub-images having the target object, bounding boxes of the target object in the video image.
The generating, according to the motion information in the motion information image, the bounding box of the target object in the video image includes: selecting features of the video image for pre-processing according to the motion information in the motion information image; and processing the pre-processed features according to a preset strategy to generate the bounding box of the target object in the video image.
In another implementation, the bounding box generating apparatus described in Embodiment 1 may be configured separately from the central processor 1201; for example, the bounding box generating apparatus may be configured as a chip connected to the central processor 1201, with the functions of the bounding box generating apparatus implemented under the control of the central processor 1201.
In this embodiment, the electronic device 1200 does not necessarily include all of the components shown in FIG. 12.
As shown in FIG. 12, the central processor 1201, sometimes also referred to as a controller or operational control, may include a microprocessor or other processor device and/or logic device; the central processor 1201 receives input and controls the operation of the various components of the electronic device 1200.
The memory 1202 may be, for example, one or more of a buffer, a flash memory, a hard drive, a removable medium, a volatile memory, a non-volatile memory, or other suitable devices. The central processor 1201 may execute the program stored in the memory 1202 to implement information storage, processing, and the like. The functions of the other components are similar to the existing ones and are not repeated here. The various components of the electronic device 1200 may be implemented by dedicated hardware, firmware, software, or a combination thereof without departing from the scope of the invention.
As can be seen from the above embodiment, generating the bounding box of the target object based on the motion information in the video image can effectively reduce the computational complexity and achieve high precision.
Embodiment 3
An embodiment of the present invention further provides a bounding box generating method, which corresponds to the bounding box generating apparatus of Embodiment 1. FIG. 13 is a flowchart of the bounding box generating method according to Embodiment 3 of the present invention. As shown in FIG. 13, the method includes:
Step 1301: performing motion detection on a video image to obtain a motion information image of the video image;
Step 1302: generating a bounding box of a target object in the video image according to the motion information in the motion information image.
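The two steps above can be sketched end to end. The sketch below uses simple frame differencing as one possible motion detector (the embodiments leave the choice of detector open) and derives a single bounding box as the tightest axis-aligned box around all motion pixels; all names, the threshold value, and the (top, left, bottom, right) box format are illustrative assumptions:

```python
import numpy as np

def motion_info_image(prev_frame, curr_frame, thresh=20):
    """Step 1301 (sketch): frame differencing followed by binarization.
    Returns a binary motion information image."""
    diff = np.abs(curr_frame.astype(int) - prev_frame.astype(int))
    return (diff >= thresh).astype(np.uint8)

def bounding_box(motion_img):
    """Step 1302 (sketch): tightest box around all motion pixels,
    as (top, left, bottom, right); None if nothing moved."""
    ys, xs = np.nonzero(motion_img)
    if ys.size == 0:
        return None
    return ys.min(), xs.min(), ys.max(), xs.max()

prev = np.zeros((6, 6), dtype=np.uint8)
curr = prev.copy()
curr[2:5, 1:4] = 200                    # a bright object appears
box = bounding_box(motion_info_image(prev, curr))
```

A real implementation would refine both steps (background modeling, noise filtering, one box per object), but the data flow matches the flowchart of FIG. 13.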
In this embodiment, the method of performing motion detection on the video image and the method of generating the bounding box according to the motion information are the same as those described in Embodiment 1 and are not repeated here.
As can be seen from the above embodiment, generating the bounding box of the target object based on the motion information in the video image can effectively reduce computational complexity and achieve high precision.
Embodiment 4
An embodiment of the present invention further provides a bounding box generating method, which corresponds to the bounding box generating apparatus of Embodiment 1. FIG. 14 is a flowchart of the bounding box generating method according to Embodiment 4 of the present invention. As shown in FIG. 14, the method includes:
Step 1401: performing motion detection on an input video image to obtain a motion information image of the video image;
Step 1402: generating bounding boxes of a target object in the video image;
Step 1403: selecting among the generated bounding boxes according to the motion information in the motion information image.
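Step 1403 can be sketched as follows, assuming candidate boxes in (top, left, bottom, right) form with inclusive bounds and a binarized motion information image. The two criteria correspond to the first threshold (amount of motion information) and the second threshold (motion information occupancy); all function and parameter names are assumptions made for illustration:

```python
import numpy as np

def select_boxes(boxes, motion_img, t_amount=None, t_ratio=None):
    """Step 1403 (sketch): keep a candidate box if its motion-pixel
    count reaches t_amount, or its motion-pixel occupancy reaches
    t_ratio."""
    kept = []
    for (t, l, b, r) in boxes:
        region = motion_img[t:b + 1, l:r + 1]
        amount = int(region.sum())            # amount of motion information
        ratio = amount / region.size          # motion information occupancy
        by_amount = t_amount is not None and amount >= t_amount
        by_ratio = t_ratio is not None and ratio >= t_ratio
        if by_amount or by_ratio:
            kept.append((t, l, b, r))
    return kept

motion = np.zeros((8, 8), dtype=np.uint8)
motion[1:4, 1:4] = 1                          # 9 motion pixels
candidates = [(0, 0, 4, 4), (5, 5, 7, 7)]
kept = select_boxes(candidates, motion, t_amount=5)
```

The amount criterion favors boxes covering many moving pixels, while the occupancy criterion additionally penalizes boxes that are much larger than the moving object; which threshold values are appropriate depends on the application.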
In this embodiment, the method of performing motion detection on the video image, the method of generating the bounding boxes, and the method of selecting bounding boxes according to the motion information are the same as those described in Embodiment 1 and are not repeated here.
As can be seen from the above embodiment, generating the bounding box of the target object based on the motion information in the video image can effectively reduce computational complexity and achieve high precision.
Embodiment 5
An embodiment of the present invention further provides a bounding box generating method, which corresponds to the bounding box generating apparatus of Embodiment 1. FIG. 15 is a flowchart of the bounding box generating method according to Embodiment 5 of the present invention. As shown in FIG. 15, the method includes:
Step 1501: performing motion detection on an input video image to obtain a motion information image of the video image;
Step 1502: extracting, according to the motion information in the motion information image, a sub-image containing the target object from the video image;
Step 1503: generating, for the sub-image containing the target object, the bounding box of the target object in the video image.
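A minimal sketch of the sub-image extraction in step 1502 follows, assuming the sub-image is taken as the motion region enlarged by a small margin. The margin parameter is purely an assumption; the embodiment only requires that the extracted sub-image contain the target object:

```python
import numpy as np

def extract_subimage(frame, motion_img, margin=1):
    """Step 1502 (sketch): crop the region of the current frame that
    contains the moving target, padded by `margin` pixels and clipped
    to the frame borders. Returns None if no motion was detected."""
    ys, xs = np.nonzero(motion_img)
    if ys.size == 0:
        return None
    t = max(ys.min() - margin, 0)
    l = max(xs.min() - margin, 0)
    b = min(ys.max() + margin, frame.shape[0] - 1)
    r = min(xs.max() + margin, frame.shape[1] - 1)
    return frame[t:b + 1, l:r + 1]

frame = np.arange(100, dtype=np.uint8).reshape(10, 10)
motion = np.zeros((10, 10), dtype=np.uint8)
motion[4:6, 4:6] = 1                    # a 2x2 moving region
sub = extract_subimage(frame, motion)
```

Step 1503 then runs bounding box generation only on this crop instead of the full frame, which is where the stated reduction in computational complexity comes from.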
In this embodiment, the method of performing motion detection on the video image, the method of extracting the sub-image, and the method of generating the bounding box for the sub-image are the same as those described in Embodiment 1 and are not repeated here.
As can be seen from the above embodiment, generating the bounding box of the target object based on the motion information in the video image can effectively reduce computational complexity and achieve high precision.
Embodiment 6
An embodiment of the present invention further provides a bounding box generating method, which corresponds to the bounding box generating apparatus of Embodiment 1. FIG. 16 is a flowchart of the bounding box generating method according to Embodiment 6 of the present invention. As shown in FIG. 16, the method includes:
Step 1601: performing motion detection on an input video image to obtain a motion information image of the video image;
Step 1602: selecting features of the video image for pre-processing according to the motion information in the motion information image;
Step 1603: processing the pre-processed features according to a predetermined strategy to generate a bounding box of a target object in the video image.
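The embodiments name selective search as one possible predetermined strategy for step 1603. As a much simpler stand-in used only for illustration, the sketch below groups the selected (non-zero) feature pixels into 4-connected components and emits one bounding box per component; it is not the selective search algorithm and all names are assumptions:

```python
import numpy as np
from collections import deque

def boxes_from_features(feature_img):
    """Step 1603 (simplified stand-in): one bounding box per
    4-connected component of non-zero feature pixels, each box
    returned as (top, left, bottom, right)."""
    h, w = feature_img.shape
    seen = np.zeros((h, w), dtype=bool)
    boxes = []
    for sy in range(h):
        for sx in range(w):
            if feature_img[sy, sx] == 0 or seen[sy, sx]:
                continue
            # breadth-first search over one connected component
            q = deque([(sy, sx)])
            seen[sy, sx] = True
            t, l, b, r = sy, sx, sy, sx
            while q:
                y, x = q.popleft()
                t, l = min(t, y), min(l, x)
                b, r = max(b, y), max(r, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and \
                       feature_img[ny, nx] and not seen[ny, nx]:
                        seen[ny, nx] = True
                        q.append((ny, nx))
            boxes.append((t, l, b, r))
    return boxes

features = np.zeros((6, 6), dtype=np.uint8)
features[0:2, 0:2] = 1                  # one small object
features[3:6, 3:5] = 1                  # another object
result = boxes_from_features(features)
```

Because step 1602 has already discarded features outside the moving regions, whichever strategy is used in step 1603 operates on far fewer features than it would on the full frame.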
In this embodiment, the method of performing motion detection on the video image, the method of selecting features of the video image, the method of pre-processing the selected features, and the method of processing the pre-processed features according to the predetermined strategy are the same as those described in Embodiment 1 and are not repeated here.
As can be seen from the above embodiment, generating the bounding box of the target object based on the motion information in the video image can effectively reduce computational complexity and achieve high precision.
An embodiment of the present invention further provides a computer-readable program, where when the program is executed in a bounding box generating apparatus or an electronic device, the program causes a computer to execute, in the bounding box generating apparatus or the electronic device, the bounding box generating method described in any one of Embodiments 3 to 6.
An embodiment of the present invention further provides a storage medium storing a computer-readable program, where the computer-readable program causes a computer to execute, in a bounding box generating apparatus or an electronic device, the bounding box generating method described in any one of Embodiments 3 to 6.
The above apparatus and method of the present invention may be implemented by hardware, or by hardware in combination with software. The present invention relates to a computer-readable program which, when executed by a logic component, enables the logic component to implement the apparatus or constituent components described above, or to implement the various methods or steps described above. The present invention also relates to a storage medium for storing the above program, such as a hard disk, a magnetic disk, an optical disc, a DVD, or a flash memory.
The present invention has been described above with reference to specific implementations, but it should be clear to those skilled in the art that these descriptions are exemplary and do not limit the protection scope of the present invention. Those skilled in the art may make various variations and modifications to the present invention according to its spirit and principles, and such variations and modifications also fall within the scope of the present invention.

Claims (13)

  1. A bounding box generating apparatus, the apparatus comprising:
    a detecting unit configured to perform motion detection on a video image to obtain a motion information image of the video image; and
    a generating unit configured to generate a bounding box of a target object in the video image according to motion information in the motion information image.
  2. The apparatus according to claim 1, wherein the generating unit comprises:
    a first generating unit configured to generate bounding boxes of the target object in the video image; and
    a first selecting unit configured to select among the generated bounding boxes according to the motion information in the motion information image.
  3. The apparatus according to claim 2, wherein the first selecting unit comprises:
    a second selecting unit configured to select bounding boxes according to an amount of motion information in each bounding box; or
    a third selecting unit configured to select bounding boxes according to a motion information occupancy of each bounding box.
  4. The apparatus according to claim 3, wherein
    the second selecting unit takes bounding boxes whose amount of motion information is greater than or equal to a first threshold as the selected bounding boxes; and
    the third selecting unit takes bounding boxes whose motion information occupancy is greater than or equal to a second threshold as the selected bounding boxes.
  5. The apparatus according to claim 1, wherein the generating unit comprises:
    an extracting unit configured to extract, according to the motion information in the motion information image, a sub-image containing the target object from the video image; and
    a second generating unit configured to generate, for the sub-image containing the target object, the bounding box of the target object in the video image.
  6. The apparatus according to claim 1, wherein the generating unit comprises:
    a pre-processing unit configured to select features of the video image for pre-processing according to the motion information in the motion information image; and
    a third generating unit configured to process the pre-processed features according to a predetermined strategy to generate the bounding box of the target object in the video image.
  7. An electronic device, comprising the bounding box generating apparatus according to any one of claims 1 to 6.
  8. A bounding box generating method, the method comprising:
    performing motion detection on a video image to obtain a motion information image of the video image; and
    generating a bounding box of a target object in the video image according to motion information in the motion information image.
  9. The method according to claim 8, wherein the generating a bounding box of the target object in the video image according to the motion information in the motion information image comprises:
    generating bounding boxes of the target object in the video image; and
    selecting among the generated bounding boxes according to the motion information in the motion information image.
  10. The method according to claim 9, wherein the selecting among the generated bounding boxes according to the motion information in the motion information image comprises:
    selecting bounding boxes according to an amount of motion information in each bounding box; or
    selecting bounding boxes according to a motion information occupancy of each bounding box.
  11. The method according to claim 10, wherein
    the selecting bounding boxes according to the amount of motion information in each bounding box comprises: taking bounding boxes whose amount of motion information is greater than or equal to a first threshold as the selected bounding boxes; and
    the selecting bounding boxes according to the motion information occupancy of each bounding box comprises: taking bounding boxes whose motion information occupancy is greater than or equal to a second threshold as the selected bounding boxes.
  12. The method according to claim 8, wherein the generating a bounding box of the target object in the video image according to the motion information in the motion information image comprises:
    extracting, according to the motion information in the motion information image, a sub-image containing the target object from the video image; and
    generating, for the sub-image containing the target object, the bounding box of the target object in the video image.
  13. The method according to claim 8, wherein the generating a bounding box of the target object in the video image according to the motion information in the motion information image comprises:
    selecting features of the video image for pre-processing according to the motion information in the motion information image; and
    processing the pre-processed features according to a predetermined strategy to generate the bounding box of the target object in the video image.