WO2023160061A1 - Method and apparatus for determining a moving object in an image, electronic device, and storage medium - Google Patents

Method and apparatus for determining a moving object in an image, electronic device, and storage medium

Info

Publication number
WO2023160061A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
moving
frame
candidate
video image
Prior art date
Application number
PCT/CN2022/134488
Other languages
English (en)
French (fr)
Inventor
郑少杰
于伟
陈智勇
王林芳
杨琛
梅涛
Original Assignee
京东科技信息技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 京东科技信息技术有限公司
Publication of WO2023160061A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/187 Segmentation; Edge detection involving region growing; involving region merging; involving connected component labelling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/60 Analysis of geometric attributes
    • G06T7/62 Analysis of geometric attributes of area, perimeter, diameter or volume
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence

Definitions

  • the present disclosure relates to the technical field of computer vision, and in particular to a method, device, electronic equipment, storage medium, computer program product and computer program for determining a moving object in an image.
  • the disclosure proposes a method, device, electronic device, storage medium, computer program product and computer program for determining a moving object in an image.
  • the embodiment of the first aspect of the present disclosure proposes a method for determining a moving object in an image, the method including: determining a plurality of first candidate frames of a target object in a current frame video image, wherein the first candidate frames are used to mark the image area occupied by the target object in the current frame image; determining a moving foreground image of the current frame video image; determining a second candidate frame of the moving target object in the current frame video image according to the moving foreground image; filtering the plurality of first candidate frames according to the moving foreground image to obtain a third candidate frame, wherein the area of the moving foreground pixels in the third candidate frame is greater than a first preset threshold; and determining a target frame of the moving target object according to the second candidate frame and the third candidate frame.
  • the determining the target frame of the moving target object according to the second candidate frame and the third candidate frame includes: determining an overlapping area between the second candidate frame and the third candidate frame; and, when the area of the overlapping area is greater than a second preset threshold, using the second candidate frame or the third candidate frame as the target frame of the moving target object.
  • the method further includes: when the area of the overlapping area is less than or equal to the second preset threshold, using both the second candidate frame and the third candidate frame as target frames of the moving target object.
  • the determining the second candidate frame of the moving target object in the current frame video image according to the moving foreground image includes: determining the connected regions of each moving foreground image, and calculating the total area of the connected regions in each moving foreground image; determining the largest circumscribed rectangular frame of each moving foreground image; obtaining, from the plurality of moving foreground images, a target moving foreground image for which the ratio of the total area of the connected regions to the area of the largest circumscribed rectangular frame is greater than a third preset threshold; and determining the largest circumscribed rectangular frame corresponding to the target moving foreground image as the second candidate frame of the moving target object.
  • the determining the moving foreground image of the current frame video image includes: acquiring a previous frame video image of the current frame video image; determining a background image of the current frame video image according to the previous frame video image; and performing foreground recognition on the current frame video image according to the background image to obtain the moving foreground image of the current frame video image, wherein the moving foreground image is used to characterize the change of the current frame video image relative to the background image.
  • This disclosure proposes a method for determining a moving object in an image. A plurality of first candidate frames of the target object in the current frame video image are determined, and the moving foreground image of the current frame video image is determined, so that the second candidate frame of the moving target object in the current frame video image is determined according to the moving foreground image and the plurality of first candidate frames are filtered to obtain the third candidate frame; the target frame of the moving target object is then determined according to the second candidate frame and the third candidate frame.
  • Filtering the candidate frames of the target object in the current frame video image reduces the influence of a large number of useless candidate frames on determining the target frame corresponding to the moving target object, so that the target frame of the moving target object in the current frame video image is determined accurately and, at the same time, the calculation efficiency is improved.
  • the embodiment of the second aspect of the present disclosure proposes a device for determining a moving object in an image, the device including: a first determining module, configured to determine a plurality of first candidate frames of a target object in a current frame video image, wherein the first candidate frames are used to mark the image area occupied by the target object in the current frame image; a second determination module, configured to determine the moving foreground image of the current frame video image; a third determination module, configured to determine the second candidate frame of the moving target object in the current frame video image according to the moving foreground image; a filtering module, configured to filter the plurality of first candidate frames according to the moving foreground image to obtain a third candidate frame, wherein the area of the moving foreground pixels in the third candidate frame is greater than a first preset threshold; and a fourth determination module, configured to determine the target frame of the moving target object according to the second candidate frame and the third candidate frame.
  • the fourth determining module includes: a determining unit, configured to determine an overlapping area between the second candidate frame and the third candidate frame; and a first generating unit, configured to use the second candidate frame or the third candidate frame as the target frame of the moving target object when the area of the overlapping region is greater than a second preset threshold.
  • the fourth determining module further includes: a second generating unit, configured to use both the second candidate frame and the third candidate frame as target frames of the moving target object when the area of the overlapping region is less than or equal to the second preset threshold.
  • the second determination module is specifically configured to: determine the connected regions of each moving foreground image, and calculate the total area of the connected regions in each moving foreground image; determine the largest circumscribed rectangular frame of each moving foreground image; obtain, from the plurality of moving foreground images, a target moving foreground image for which the ratio of the total area of the connected regions to the area of the largest circumscribed rectangular frame is greater than the third preset threshold; and determine the largest circumscribed rectangular frame corresponding to the target moving foreground image as the second candidate frame of the moving target object.
  • the first determining module is specifically configured to: acquire a previous frame video image of the current frame video image; determine the background image of the current frame video image according to the previous frame video image; and perform foreground recognition on the current frame video image according to the background image to obtain the moving foreground image of the current frame video image, wherein the moving foreground image is used to characterize the change of the current frame video image relative to the background image.
  • the present disclosure proposes a device for determining a moving object in an image. A plurality of first candidate frames of the target object in the current frame video image are determined, and the moving foreground image of the current frame video image is determined, so that the second candidate frame of the moving target object in the current frame video image is determined according to the moving foreground image and the plurality of first candidate frames are filtered to obtain the third candidate frame; the target frame of the moving target object is then determined according to the second candidate frame and the third candidate frame.
  • Filtering the candidate frames of the target object in the current frame video image reduces the influence of a large number of useless candidate frames on determining the target frame corresponding to the moving target object, so that the target frame of the moving target object in the current frame video image is determined accurately and, at the same time, the calculation efficiency is improved.
  • the embodiment of the third aspect of the present disclosure proposes an electronic device, including: a memory, a processor, and a computer program stored in the memory and operable on the processor. When the processor executes the program, the method for determining a moving object in an image in the embodiment of the present disclosure is implemented.
  • the embodiment of the fourth aspect of the present disclosure provides a computer-readable storage medium on which a computer program is stored. When the program is executed by a processor, the method for determining a moving object in an image in the embodiment of the present disclosure is implemented.
  • the embodiment of the fifth aspect of the present disclosure provides a computer program product, including a computer program. When the computer program is executed by a processor, the method for determining a moving object in an image in the embodiment of the present disclosure is implemented.
  • the embodiment of the sixth aspect of the present disclosure provides a computer program, the computer program including computer program code; when the computer program code is run on a computer, the computer executes the method for determining a moving object in an image in the embodiment of the present disclosure.
  • FIG. 1 is a schematic flowchart of a method for determining a moving object in an image provided by an embodiment of the present disclosure
  • FIG. 2 is a schematic flowchart of another method for determining a moving object in an image provided by an embodiment of the present disclosure
  • Fig. 3 is a schematic flowchart of another method for determining a moving object in an image provided by an embodiment of the present disclosure
  • Fig. 4 is an example diagram of a moving foreground image provided by an embodiment of the present disclosure
  • FIG. 5 is a schematic structural diagram of an apparatus for determining a moving object in an image provided by an embodiment of the present disclosure
  • Fig. 6 is a schematic structural diagram of another device for determining a moving object in an image provided by an embodiment of the present disclosure
  • FIG. 7 is a block diagram of an electronic device according to one embodiment of the present disclosure.
  • Fig. 1 is a schematic flowchart of a method for determining a moving object in an image provided by an embodiment of the present disclosure.
  • the method for determining a moving object in an image provided by this embodiment is executed by a device for determining a moving object in an image, and the device for determining a moving object in an image may be implemented by software and/or hardware.
  • the apparatus for determining a moving object in an image in this embodiment may be configured in an electronic device, and the electronic device in this embodiment may include a server, and this embodiment does not specifically limit the electronic device.
  • As shown in Fig. 1, the method for determining a moving object in an image may include steps 101-105.
  • Step 101 Determine a plurality of first candidate frames of a target object in a current frame video image, wherein the first candidate frames are used to mark the image area occupied by the target object in the current frame image.
  • the above-mentioned determination of multiple first candidate frames of the target object in the current frame video image can be implemented in various ways, and the exemplary description is as follows:
  • the current frame video image may be input into a pre-trained neural network model, so as to obtain a plurality of first candidate frames of the target object in the current frame video image through the neural network model.
  • the current frame video image may be processed based on the features of the target object to obtain an area matching the features of the target object, and a candidate frame of the target object is determined according to the area.
  • Step 102 determine the moving foreground image of the current frame video image.
  • the determination of the moving foreground image of the current frame video image can be implemented in various ways, such as frame difference method, background modeling method, optical flow method, but not limited thereto.
  • Step 103 according to the moving foreground image, determine the second candidate frame of the moving target object in the current frame video image.
  • for example, in a violent-sorting scenario, a package that is in motion is moving at high speed, which causes motion blur in the image.
  • in this case, the moving foreground image of the current frame video image containing the package can be obtained and segmented according to the change information of the image, and the second candidate frame of the moving target object in the current frame video image is then determined by computing on the segmented regions, so that the second candidate frame is determined accurately based on the moving foreground image.
  • Step 104 Filter a plurality of first candidate frames according to the moving foreground image to obtain a third candidate frame, wherein the area of the moving foreground pixel in the third candidate frame is greater than a first preset threshold.
  • one implementation of filtering the plurality of first candidate frames according to the moving foreground image to obtain the third candidate frame is as follows: for each first candidate frame, the area of the pixels whose value is 1 within that first candidate frame is determined according to the moving foreground image; if the area is smaller than the first preset threshold, it is judged that there is no moving target object in the first candidate frame, and the first candidate frame is filtered out. The plurality of first candidate frames are filtered in this way to obtain the third candidate frames, which effectively avoids useless calculation on a large number of static packages and saves computing resources when detecting violent sorting of packages.
  • the first preset threshold may be set in combination with an actual scene, which is not specifically limited in this embodiment.
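  • The filtering in step 104 can be sketched as follows. This is a minimal illustration rather than the disclosed implementation: the function name `filter_boxes_by_motion`, the parameter `min_area` (standing in for the first preset threshold), and the convention that the bottom-right corner is exclusive are all assumptions made for the example.

```python
import numpy as np

def filter_boxes_by_motion(boxes, foreground, min_area):
    """Keep boxes whose moving-foreground pixel count exceeds min_area.

    boxes: iterable of (x1, y1, x2, y2) boxes, top-left and bottom-right corners
           (bottom-right treated as exclusive here, an assumption of this sketch).
    foreground: 2-D binary array, 1 = moving foreground, 0 = background.
    min_area: hypothetical stand-in for the first preset threshold, in pixels.
    """
    kept = []
    for x1, y1, x2, y2 in boxes:
        # Rows are indexed by y and columns by x; summing a binary mask
        # gives the number of moving-foreground pixels inside the box.
        area = int(foreground[y1:y2, x1:x2].sum())
        if area > min_area:
            kept.append((x1, y1, x2, y2))
    return kept

mask = np.zeros((10, 10), dtype=np.uint8)
mask[2:6, 2:6] = 1                            # 16 moving pixels
first_boxes = [(1, 1, 7, 7), (6, 6, 10, 10)]  # one moving, one static
third_boxes = filter_boxes_by_motion(first_boxes, mask, min_area=8)
```

  • In this toy input the first box covers the moving region and is kept, while the second box contains no foreground pixels and is discarded.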
  • Step 105 determine the target frame of the moving target object according to the second candidate frame and the third candidate frame.
  • the second candidate frame is fused with the third candidate frame, and the target frame of the moving target object is determined according to the overlapping area of the second candidate frame and the third candidate frame. That is, by combining the detection results of the static image with the dynamic motion information, a large number of stationary objects that are useless for recognizing "violent sorting" are filtered out, which effectively reduces the amount of calculation and improves the calculation efficiency.
  • This disclosure proposes a method for determining a moving object in an image. A plurality of first candidate frames of the target object in the current frame video image are determined, and the moving foreground image of the current frame video image is determined, so that the second candidate frame of the moving target object in the current frame video image is determined according to the moving foreground image and the plurality of first candidate frames are filtered to obtain the third candidate frame; the target frame of the moving target object is then determined according to the second candidate frame and the third candidate frame.
  • Filtering the candidate frames of the target object in the current frame video image reduces the influence of a large number of useless candidate frames on determining the target frame corresponding to the moving target object, so that the target frame of the moving target object in the current frame video image is determined accurately and, at the same time, the calculation efficiency is improved.
  • Fig. 2 is a schematic flowchart of another method for determining a moving object in an image provided by an embodiment of the present disclosure. As shown in FIG. 2, the method for determining a moving object in an image includes steps 201-206.
  • Step 201 determine a plurality of first candidate frames of the target object in the current frame video image, wherein the first candidate frames are used to mark the image area occupied by the target object in the current frame image.
  • for a specific implementation of step 201, reference may be made to the relevant descriptions in the foregoing embodiments.
  • Step 202 determine the moving foreground image of the current frame video image.
  • one implementation of determining the moving foreground image of the current frame video image is to obtain the previous frame video image of the current frame video image, determine the background image of the current frame video image according to the previous frame video image, and perform foreground recognition on the current frame video image according to the background image to obtain the moving foreground image of the current frame video image, wherein the moving foreground image represents the change of the current frame video image relative to the background image, so that moving objects are effectively and accurately extracted as foreground from the static image.
  • Step 203 determine the second candidate frame of the moving target object in the current frame video image.
  • one implementation of determining the second candidate frame of the moving target object in the current frame video image is to determine the connected regions of each moving foreground image, calculate the total area of the connected regions in each moving foreground image, and determine the largest circumscribed rectangular frame of each moving foreground image; from the plurality of moving foreground images, the target moving foreground image whose ratio of the total area of the connected regions to the area of the largest circumscribed rectangular frame is greater than the third preset threshold is obtained, and the largest circumscribed rectangular frame corresponding to the target moving foreground image is determined as the second candidate frame of the moving target object. The second candidate frame of the moving target object in the current frame video image is thus determined accurately based on the connectivity calculation on the moving foreground image.
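  • One way to read the connected-region test of step 203 is sketched below. It is an interpretation, not the disclosed implementation: each 4-connected region of a binary foreground mask is treated as one candidate, its bounding rectangle stands in for the largest circumscribed rectangular frame, and `min_fill_ratio` is a hypothetical name for the third preset threshold. The choice of 4-connectivity and the breadth-first labelling are also assumptions.

```python
from collections import deque
import numpy as np

def connected_regions(mask):
    """Return a list of pixel lists, one per 4-connected foreground region."""
    h, w = mask.shape
    seen = np.zeros((h, w), dtype=bool)
    regions = []
    for sy in range(h):
        for sx in range(w):
            if mask[sy, sx] == 0 or seen[sy, sx]:
                continue
            # Breadth-first flood fill starting from an unvisited foreground pixel.
            queue = deque([(sy, sx)])
            seen[sy, sx] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny, nx] and not seen[ny, nx]:
                        seen[ny, nx] = True
                        queue.append((ny, nx))
            regions.append(pixels)
    return regions

def motion_boxes(mask, min_fill_ratio):
    """Bounding boxes (x1, y1, x2, y2) of regions whose fill ratio exceeds the threshold."""
    boxes = []
    for pixels in connected_regions(mask):
        ys = [p[0] for p in pixels]
        xs = [p[1] for p in pixels]
        x1, y1, x2, y2 = min(xs), min(ys), max(xs) + 1, max(ys) + 1
        # Fill ratio: region area divided by the area of its bounding rectangle.
        fill = len(pixels) / ((x2 - x1) * (y2 - y1))
        if fill > min_fill_ratio:
            boxes.append((x1, y1, x2, y2))
    return boxes

mask = np.zeros((8, 12), dtype=np.uint8)
mask[1:5, 1:5] = 1      # solid block: fills its bounding rectangle completely
mask[6, 6:12] = 1       # thin L-shaped region: sparse inside its bounding rectangle
mask[2:7, 11] = 1
second_boxes = motion_boxes(mask, min_fill_ratio=0.6)
```

  • The solid block passes the ratio test, while the thin L-shaped region covers only a third of its bounding rectangle and is rejected.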
  • Step 204 Filter a plurality of first candidate frames according to the moving foreground image to obtain a third candidate frame, wherein the area of the moving foreground pixel in the third candidate frame is greater than a first preset threshold.
  • the area of the moving foreground pixels in each first candidate frame can be determined according to the moving foreground image and compared with the first preset threshold corresponding to the target object; if the area of the moving foreground pixels in the first candidate frame is greater than the first preset threshold, the first candidate frame is used as the third candidate frame.
  • specifically, the position information of the first candidate frame in the current frame video image is obtained, and the area of the moving foreground pixels at that position is determined from the moving foreground image, which gives the area of the moving foreground pixels in the first candidate frame.
  • Step 205 determining the overlapping area between the second candidate frame and the third candidate frame.
  • Step 206 if the area of the overlapping region is larger than the second preset threshold, use the second candidate frame or the third candidate frame as the target frame of the moving target object.
  • when the area of the overlapping region is less than or equal to the second preset threshold, both the second candidate frame and the third candidate frame are used as target frames of the moving target object, so that the target frame of the target object is accurately determined according to the overlapping area between the second candidate frame and the third candidate frame.
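  • The overlap rule of steps 205-206 can be sketched as follows. The names `overlap_area`, `fuse` and `min_overlap` (standing in for the second preset threshold) are hypothetical. The text allows either frame to be kept when the overlap exceeds the threshold; keeping the third candidate frame in that case is an arbitrary choice made for this example.

```python
def overlap_area(a, b):
    """Area of the intersection of two (x1, y1, x2, y2) boxes, 0 if they are disjoint."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(w, 0) * max(h, 0)

def fuse(second_boxes, third_boxes, min_overlap):
    """Merge motion-derived (second) and filtered detector (third) candidate frames."""
    targets = list(third_boxes)
    for s in second_boxes:
        # A second candidate that overlaps some third candidate by more than the
        # threshold is treated as the same object, so only the third is kept.
        # Otherwise the second candidate is kept as an additional target frame.
        if not any(overlap_area(s, t) > min_overlap for t in third_boxes):
            targets.append(s)
    return targets

second = [(0, 0, 10, 10), (50, 50, 60, 60)]
third = [(1, 1, 11, 11)]
targets = fuse(second, third, min_overlap=50)
```

  • Here the first motion-derived frame overlaps the detector frame by 81 pixels and is merged into it, while the second one has no overlap and is kept as its own target frame.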
  • This disclosure proposes a method for determining a moving object in an image, which determines a plurality of first candidate frames of the target object in the current frame video image and determines the moving foreground image of the current frame video image, so as to determine the second candidate frame of the moving target object in the current frame video image based on the moving foreground image, and filters the plurality of first candidate frames to obtain the third candidate frame. When the area of the overlapping area between the second candidate frame and the third candidate frame is greater than the second preset threshold, the second candidate frame or the third candidate frame is used as the target frame of the moving target object. Therefore, in the process of detecting the motion of the target object, the target frame of the target object can be accurately determined by screening based on both the static image and the dynamic motion information.
  • Fig. 3 is a schematic flowchart of another method for determining a moving object in an image provided by an embodiment of the present disclosure. As shown in FIG. 3 , the method for determining the moving object in the image includes: steps 301-306.
  • Step 301 acquire a sequence of video images.
  • an input video stream or a local video file may be decoded and frame-sampled to extract an image sequence as the input for target object detection.
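  • Frame sampling in step 301 could look like the sketch below. It assumes the frames have already been decoded into an iterable by some video library, which is outside this example; the function name `sample_frames` and the fixed sampling interval `step` are hypothetical and not specified by the text.

```python
def sample_frames(frames, step):
    """Yield (index, frame) for every step-th frame of an iterable of decoded frames."""
    if step < 1:
        raise ValueError("step must be a positive integer")
    for i, frame in enumerate(frames):
        # Keep frame 0, step, 2*step, ... and skip the frames in between.
        if i % step == 0:
            yield i, frame

# Letters stand in for decoded images in this toy run.
sampled = list(sample_frames("abcdefg", step=3))
```

  • The original frame index is kept alongside each sampled frame so that it can serve as the frame index t of the image sequence.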
  • Step 302 object detection of a single static picture.
  • each frame of static picture in the image sequence collected above is obtained, and target detection is performed to extract the package target frames B_model(t) in each frame of image. It can be understood that the way of extracting the package target frames in each frame of image may include, but is not limited to, target detection and a deep learning neural network model.
  • B_model(t) represents the target frame results of the static picture predicted by the model, where t is the frame index of the image sequence and each element is the coordinates of the upper-left and lower-right corners of a target frame, (x1, y1, x2, y2).
  • Step 303 segmentation of the moving foreground region.
  • the segmentation of the moving foreground region may adopt, but is not limited to, a frame difference method, a background modeling method, and an optical flow method, which are not specifically limited in this embodiment.
  • the frame difference method and the optical flow method both take the current frame picture and the previous frame picture as input to calculate the change information of the image; the result of the frame difference method or optical flow method is then binarized according to an empirical threshold corresponding to the environment to obtain a moving foreground image, as shown in Figure 4.
  • the left side of Figure 4 is a static color (red-green-blue, RGB) image of a certain frame; the right side of Figure 4, excluding the dotted frames, is the segmentation result of the moving foreground, in which the gray area is the moving foreground image F(x, y, t).
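  • As a minimal sketch of the frame difference variant, the binary moving foreground image F(x, y, t) can be obtained by thresholding the absolute difference between two consecutive frames. Grayscale input, the function name `frame_difference_mask`, and the threshold value used in the toy run are assumptions of this example; the text only states that an empirical, environment-dependent threshold is used.

```python
import numpy as np

def frame_difference_mask(prev_frame, cur_frame, threshold):
    """Binary moving-foreground mask from two grayscale frames of equal shape."""
    # Promote to a signed type so that subtracting uint8 images cannot wrap around.
    diff = np.abs(cur_frame.astype(np.int16) - prev_frame.astype(np.int16))
    # 1 marks pixels that changed by more than the threshold, 0 marks background.
    return (diff > threshold).astype(np.uint8)

prev_frame = np.full((4, 4), 100, dtype=np.uint8)
cur_frame = prev_frame.copy()
cur_frame[1:3, 1:3] = 180        # a small region changes brightness
F = frame_difference_mask(prev_frame, cur_frame, threshold=30)
```

  • The resulting mask uses 1 for moving foreground pixels and 0 for background pixels, which matches the binary image used in the fusion step.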
  • motion foreground segmentation refers to marking the pixels associated with each independent motion among various motions with sequence features and clustering these pixels according to the objects they belong to, so as to extract moving objects as foreground from the static background.
  • Step 304 generating an additional target frame of the moving object.
  • the moving foreground segmentation result obtained in step 303 for the gray moving foreground image on the right side of FIG.
  • as shown by the rectangular boxes c and d on the right side of Figure 4, because of the high-speed motion of the package itself, the changed region caused by package movement in the moving foreground segmentation result often appears as a connected region with few internal holes.
  • the candidate box can be considered as a moving package target.
  • the candidate box c on the right is a real moving package, and the candidate box d is caused by the local movement of a human arm.
  • Step 305 result fusion.
  • the static target objects in the target detection result of the static picture are filtered out, wherein the motion segmentation foreground image obtained in step 303 is a binary image in which moving foreground pixels are 1 and background pixels are 0.
  • Step 306 output the moving object detection result.
  • This disclosure proposes a method for determining a moving object in an image.
  • package detection is performed on each frame of static picture, and the moving foreground area is segmented to generate additional moving package target frames; the moving object frames are then fused with the detection frames of the static object detection result to obtain the target candidate frames, and the moving object detection result is output. Therefore, in the process of detecting the motion of the target object, the detection result of the static image is fused with the dynamic motion information, so as to accurately obtain and output the detection result of the moving object.
  • Fig. 5 is a schematic structural diagram of an apparatus for determining a moving object in an image provided by an embodiment of the present disclosure.
  • the apparatus 500 for determining a moving object in the image includes modules 501-505.
  • the first determination module 501 is configured to determine a plurality of first candidate frames of the target object in the current frame video image, wherein the first candidate frames are used to mark the image area occupied by the target object in the current frame image.
  • the second determination module 502 is configured to determine the moving foreground image of the current frame video image.
  • the third determining module 503 is configured to determine the second candidate frame of the moving target object in the current frame video image according to the moving foreground image.
  • the filtering module 504 is configured to filter a plurality of first candidate frames according to the moving foreground image to obtain a third candidate frame, wherein the area of the moving foreground pixel in the third candidate frame is greater than a first preset threshold.
  • the fourth determination module 505 is configured to determine the target frame of the moving target object according to the second candidate frame and the third candidate frame.
  • the present disclosure proposes a device for determining a moving object in an image. A plurality of first candidate frames of the target object in the current frame video image are determined, and the moving foreground image of the current frame video image is determined, so that the second candidate frame of the moving target object in the current frame video image is determined according to the moving foreground image and the plurality of first candidate frames are filtered to obtain the third candidate frame; the target frame of the moving target object is then determined according to the second candidate frame and the third candidate frame.
  • Filtering the candidate frames of the target object in the current frame video image reduces the influence of a large number of useless candidate frames on determining the target frame corresponding to the moving target object, so that the target frame of the moving target object in the current frame video image is determined accurately and, at the same time, the calculation efficiency is improved.
  • the fourth determining module 505 includes:
  • a determining unit 5051 configured to determine an overlapping area between the second candidate frame and the third candidate frame.
  • the first generation unit 5052 is configured to use the second candidate frame or the third candidate frame as the target frame of the moving target object when the area of the overlapping region is larger than the second preset threshold.
  • the fourth determining module 505 further includes:
  • the second generation unit 5053 is configured to use both the second candidate frame and the third candidate frame as the target frame of the moving target object when the area of the overlapping region is less than or equal to the second preset threshold.
  • the second determination module 502 is specifically used for:
  • the connected regions of each moving foreground image are determined, and the total area of the connected regions in each moving foreground image is calculated.
  • the largest circumscribed rectangular frame corresponding to the target moving foreground image is determined as the second candidate frame of the moving target object.
  • the first determination module 501 is specifically used for:
  • This disclosure proposes a method for determining a moving object in an image.
  • a plurality of first candidate frames of the target object in the current frame video image are determined, and the moving foreground image of the current frame video image is determined, so that the second candidate frame of the moving target object in the current frame video image is determined according to the moving foreground image; the plurality of first candidate frames are filtered to obtain the third candidate frame, and the target frame of the moving target object is determined according to the second candidate frame and the third candidate frame.
  • filtering the candidate frames of the target object in the current frame video image, in combination with the moving foreground image, reduces the influence of a large number of useless candidate frames on determining the target frame of the moving target object, so that the target frame is determined accurately while calculation efficiency is improved.
  • FIG. 7 is a block diagram of an electronic device according to an embodiment of the present disclosure.
  • the electronic equipment includes:
  • a memory 701, a processor 702, and computer instructions stored on the memory 701 and executable on the processor 702.
  • the electronic equipment also includes:
  • the communication interface 703 is used for communication between the memory 701 and the processor 702.
  • the memory 701 is used for storing computer instructions that can be executed on the processor 702.
  • the memory 701 may include a high-speed RAM memory, and may also include a non-volatile memory, such as at least one disk memory.
  • the processor 702 is configured to implement the method for determining a moving object in an image in the above embodiment when executing a program.
  • the bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, or an Extended Industry Standard Architecture (EISA) bus.
  • the bus can be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is used in FIG. 7, but this does not mean that there is only one bus or one type of bus.
  • if the memory 701, the processor 702 and the communication interface 703 are integrated on one chip, the memory 701, the processor 702 and the communication interface 703 can communicate with each other through an internal interface.
  • the processor 702 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present disclosure.
  • Embodiments of the present disclosure also provide a computer-readable storage medium on which a computer program is stored, and when the program is executed by a processor, the method for determining a moving object in an image as proposed in the foregoing embodiments of the present disclosure is implemented.
  • the computer readable storage medium is a non-transitory computer readable storage medium.
  • Embodiments of the present disclosure also provide a computer program product, including a computer program, when the computer program is executed by a processor, the method for determining a moving object in an image as provided in the foregoing embodiments of the present disclosure is implemented.
  • An embodiment of the present disclosure also provides a computer program, wherein the computer program includes computer program code, and when the computer program code is run on a computer, the computer is made to execute the method for determining a moving object in an image as provided in the foregoing embodiments of the present disclosure.
  • the terms “first” and “second” are used for descriptive purposes only, and cannot be interpreted as indicating or implying relative importance or implicitly specifying the quantity of indicated technical features.
  • the features defined as “first” and “second” may explicitly or implicitly include at least one of these features.
  • “plurality” means at least two, such as two, three, etc., unless otherwise specifically defined.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Geometry (AREA)
  • Image Analysis (AREA)

Abstract

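Provided are a method and apparatus for determining a moving object in an image, an electronic device, a storage medium, a computer program product, and a computer program. The method includes: in the process of determining the target frame of a moving target object in a current frame video image, determining a plurality of first candidate frames of the target object in the current frame video image, and determining a moving foreground image of the current frame video image; determining, according to the moving foreground image, a second candidate frame of the moving target object in the current frame video image, and filtering the plurality of first candidate frames to obtain a third candidate frame; and determining the target frame of the moving target object according to the second candidate frame and the third candidate frame.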

Description

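Method and apparatus for determining a moving object in an image, electronic device, and storage medium
Cross-Reference to Related Applications
This application is based on and claims priority to Chinese patent application No. 2022101798460, filed on February 25, 2022, the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to the field of computer vision technology, and in particular to a method and apparatus for determining a moving object in an image, an electronic device, a storage medium, a computer program product, and a computer program.
Background
In recent years, with the continuous development of the logistics industry, violent sorting has occurred frequently while items are being sorted. In the related art, video of the sorting process is usually captured, object detection is performed on each static frame of the video, and whether "violent sorting behavior" has occurred is determined from the object detection results. Because object detection is performed only on individual static frames, the detection results contain useless data and are not accurate enough, so many moving parcels are easily missed and "violent sorting behavior" is easily overlooked. Therefore, accurately determining the required target object in each frame is very important for accurately detecting violent sorting behavior.
Summary
The present disclosure proposes a method and apparatus for determining a moving object in an image, an electronic device, a storage medium, a computer program product, and a computer program.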
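An embodiment of the first aspect of the present disclosure proposes a method for determining a moving object in an image, the method including: determining a plurality of first candidate frames (that is, bounding boxes) of a target object in a current frame video image, where each first candidate frame marks the image region occupied by the target object in the current frame image; determining a moving foreground image of the current frame video image; determining, according to the moving foreground image, a second candidate frame of the moving target object in the current frame video image; filtering the plurality of first candidate frames according to the moving foreground image to obtain a third candidate frame, where the area of moving foreground pixels in the third candidate frame is greater than a first preset threshold; and determining the target frame of the moving target object according to the second candidate frame and the third candidate frame.
In an embodiment of the present disclosure, determining the target frame of the moving target object according to the second candidate frame and the third candidate frame includes: determining an overlapping region between the second candidate frame and the third candidate frame; and, when the area of the overlapping region is greater than a second preset threshold, using either the second candidate frame or the third candidate frame as the target frame of the moving target object.
In an embodiment of the present disclosure, the method further includes: when the area of the overlapping region is less than or equal to the second preset threshold, using both the second candidate frame and the third candidate frame as target frames of the moving target object.
In an embodiment of the present disclosure, there are a plurality of moving foreground images, and determining, according to the moving foreground image, the second candidate frame of the moving target object in the current frame video image includes: determining the connected regions of each moving foreground image, and calculating the total area of the connected regions in each moving foreground image; determining the largest circumscribed rectangular frame of each moving foreground image; obtaining, from the plurality of moving foreground images, a target moving foreground image for which the ratio of the total area of the connected regions to the area of the largest circumscribed rectangular frame is greater than a third preset threshold; and determining the largest circumscribed rectangular frame corresponding to the target moving foreground image as the second candidate frame of the moving target object.
In an embodiment of the present disclosure, determining the moving foreground image of the current frame video image includes: obtaining the previous frame video image of the current frame video image; determining a background image of the current frame video image according to the previous frame video image; and performing foreground recognition on the current frame video image according to the background image to obtain the moving foreground image of the current frame video image, where the moving foreground image characterizes the change of the current frame video image relative to the background image.
The present disclosure proposes a method for determining a moving object in an image. In the process of determining the target frame of a moving target object in the current frame video image, a plurality of first candidate frames of the target object in the current frame video image are determined, and a moving foreground image of the current frame video image is determined; according to the moving foreground image, a second candidate frame of the moving target object in the current frame video image is determined, and the plurality of first candidate frames are filtered to obtain a third candidate frame; the target frame of the moving target object is then determined according to the second candidate frame and the third candidate frame. Thus, by filtering the candidate frames of the target object in combination with the moving foreground image of the current frame video image, the influence of a large number of useless candidate frames on determining the target frame of the moving target object is reduced, so that the target frame of the moving target object in the current frame video image is determined accurately while computational efficiency is improved.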
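An embodiment of the second aspect of the present disclosure proposes an apparatus for determining a moving object in an image, the apparatus including: a first determining module, configured to determine a plurality of first candidate frames of a target object in a current frame video image, where each first candidate frame marks the image region occupied by the target object in the current frame image; a second determining module, configured to determine a moving foreground image of the current frame video image; a third determining module, configured to determine, according to the moving foreground image, a second candidate frame of the moving target object in the current frame video image; a filtering module, configured to filter the plurality of first candidate frames according to the moving foreground image to obtain a third candidate frame, where the area of moving foreground pixels in the third candidate frame is greater than a first preset threshold; and a fourth determining module, configured to determine the target frame of the moving target object according to the second candidate frame and the third candidate frame.
In an embodiment of the present disclosure, the fourth determining module includes: a determining unit, configured to determine an overlapping region between the second candidate frame and the third candidate frame; and a first generation unit, configured to use either the second candidate frame or the third candidate frame as the target frame of the moving target object when the area of the overlapping region is greater than a second preset threshold.
In an embodiment of the present disclosure, the fourth determining module further includes: a second generation unit, configured to use both the second candidate frame and the third candidate frame as target frames of the moving target object when the area of the overlapping region is less than or equal to the second preset threshold.
In an embodiment of the present disclosure, there are a plurality of moving foreground images, and the second determining module is specifically configured to: determine the connected regions of each moving foreground image, and calculate the total area of the connected regions in each moving foreground image; determine the largest circumscribed rectangular frame of each moving foreground image; obtain, from the plurality of moving foreground images, a target moving foreground image for which the ratio of the total area of the connected regions to the area of the largest circumscribed rectangular frame is greater than a third preset threshold; and determine the largest circumscribed rectangular frame corresponding to the target moving foreground image as the second candidate frame of the moving target object.
In an embodiment of the present disclosure, the first determining module is specifically configured to: obtain the previous frame video image of the current frame video image; determine a background image of the current frame video image according to the previous frame video image; and perform foreground recognition on the current frame video image according to the background image to obtain the moving foreground image of the current frame video image, where the moving foreground image characterizes the change of the current frame video image relative to the background image.
The present disclosure proposes an apparatus for determining a moving object in an image. In the process of determining the target frame of a moving target object in the current frame video image, a plurality of first candidate frames of the target object in the current frame video image are determined, and a moving foreground image of the current frame video image is determined; according to the moving foreground image, a second candidate frame of the moving target object in the current frame video image is determined, and the plurality of first candidate frames are filtered to obtain a third candidate frame; the target frame of the moving target object is then determined according to the second candidate frame and the third candidate frame. Thus, by filtering the candidate frames of the target object in combination with the moving foreground image of the current frame video image, the influence of a large number of useless candidate frames on determining the target frame of the moving target object is reduced, so that the target frame of the moving target object in the current frame video image is determined accurately while computational efficiency is improved.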
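An embodiment of the third aspect of the present disclosure proposes an electronic device, including: a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor, when executing the program, implements the method for determining a moving object in an image of the embodiments of the present disclosure.
An embodiment of the fourth aspect of the present disclosure proposes a computer-readable storage medium on which a computer program is stored, where the program, when executed by a processor, implements the method for determining a moving object in an image of the embodiments of the present disclosure.
An embodiment of the fifth aspect of the present disclosure proposes a computer program product, including a computer program, where the computer program, when executed by a processor, implements the method for determining a moving object in an image of the embodiments of the present disclosure.
An embodiment of the sixth aspect of the present disclosure proposes a computer program, where the computer program includes computer program code, and when the computer program code is run on a computer, the computer is caused to execute the method for determining a moving object in an image of the embodiments of the present disclosure.
Other effects of the above optional implementations will be described below in conjunction with specific embodiments.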
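Brief Description of the Drawings
FIG. 1 is a schematic flowchart of a method for determining a moving object in an image provided by an embodiment of the present disclosure;
FIG. 2 is a schematic flowchart of another method for determining a moving object in an image provided by an embodiment of the present disclosure;
FIG. 3 is a schematic flowchart of another method for determining a moving object in an image provided by an embodiment of the present disclosure;
FIG. 4 is an example diagram of a moving foreground image provided by an embodiment of the present disclosure;
FIG. 5 is a schematic structural diagram of an apparatus for determining a moving object in an image provided by an embodiment of the present disclosure;
FIG. 6 is a schematic structural diagram of another apparatus for determining a moving object in an image provided by an embodiment of the present disclosure;
FIG. 7 is a block diagram of an electronic device according to an embodiment of the present disclosure.
Detailed Description
Embodiments of the present disclosure are described in detail below, and examples of the embodiments are shown in the accompanying drawings, where the same or similar reference numerals throughout denote the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary, are intended to explain the present disclosure, and should not be construed as limiting it.
The method and apparatus for determining a moving object in an image and the electronic device of the embodiments of the present disclosure are described below with reference to the accompanying drawings.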
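FIG. 1 is a schematic flowchart of a method for determining a moving object in an image provided by an embodiment of the present disclosure. It should be noted that the method of this embodiment is executed by an apparatus for determining a moving object in an image. The apparatus may be implemented in software and/or hardware and may be configured in an electronic device. The electronic device in this embodiment may include a server, and this embodiment does not specifically limit the electronic device.
As shown in FIG. 1, the method for determining a moving object in an image may include steps 101 to 105.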
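Step 101: determine a plurality of first candidate frames of the target object in the current frame video image, where each first candidate frame marks the image region occupied by the target object in the current frame image.
In some embodiments, when identifying "violent sorting" in a logistics scenario, video of the parcel-sorting process may be captured in order to detect whether violent sorting is occurring, and each video frame of the captured video is processed with the method proposed in this embodiment. It should be noted that the target object may be a parcel.
In some embodiments, determining the plurality of first candidate frames of the target object in the current frame video image can be implemented in several ways, for example:
According to one implementation of the present disclosure, the current frame video image may be input into a pre-trained neural network model, and the plurality of first candidate frames of the target object in the current frame video image are obtained from the neural network model.
According to another implementation of the present disclosure, the current frame video image may be processed based on the features of the target object to obtain a region matching those features, and the candidate frame of the target object is determined according to that region.
Step 102: determine the moving foreground image of the current frame video image.
The moving foreground image of the current frame video image can be determined in several ways, for example by the frame difference method, the background modeling method, or the optical flow method, but is not limited to these.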
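Step 103: determine, according to the moving foreground image, the second candidate frame of the moving target object in the current frame video image.
In some embodiments, during violent sorting a parcel moves at high speed, which causes motion blur. To avoid failing to accurately detect the candidate frame of the moving target object from a single video frame, this embodiment may obtain the moving foreground image of the current frame video image together with the current frame video image, perform segmentation according to the change information of the image, and determine the second candidate frame of the moving target object in the current frame video image from calculations on the segmented regions. In this way, the second candidate frame of the moving target object in the current frame video image is accurately determined based on the moving foreground image.
Step 104: filter the plurality of first candidate frames according to the moving foreground image to obtain the third candidate frame, where the area of moving foreground pixels in the third candidate frame is greater than the first preset threshold.
In some embodiments, one implementation of filtering the plurality of first candidate frames according to the moving foreground image is as follows: for each first candidate frame, the area of pixels with value 1 inside that first candidate frame is determined according to the moving foreground image; when this area is less than the first preset threshold, it is judged that there is no moving target object in the first candidate frame, and the first candidate frame is filtered out. The plurality of first candidate frames are filtered in this way to obtain the third candidate frame, which effectively reduces useless computation on the many static parcels during violent-sorting detection and saves computing resources.
The first preset threshold may be set according to the actual scenario, and this embodiment does not specifically limit it.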
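The filtering in step 104 can be sketched as follows. This is a minimal illustration rather than the implementation of the disclosure: it assumes the moving foreground image is a binary mask stored as nested lists (1 for moving foreground, 0 for background) and that a candidate frame is (x1, y1, x2, y2) with an exclusive bottom-right corner. The names `foreground_area`, `filter_first_candidates`, and `min_area` are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2); bottom-right corner treated as exclusive (assumption)
Mask = List[List[int]]           # binary mask: 1 = moving foreground, 0 = background


def foreground_area(mask: Mask, box: Box) -> int:
    """Count the foreground pixels (value 1) inside the box, clipped to the mask."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    x1, y1, x2, y2 = box
    x1, y1 = max(x1, 0), max(y1, 0)
    x2, y2 = min(x2, width), min(y2, height)
    return sum(sum(row[x1:x2]) for row in mask[y1:y2])


def filter_first_candidates(mask: Mask, boxes: List[Box], min_area: int) -> List[Box]:
    """Keep only the first candidate frames whose foreground area exceeds min_area."""
    return [box for box in boxes if foreground_area(mask, box) > min_area]
```

With a mask containing a single 2x2 foreground blob, a frame covering the blob is kept and a frame over pure background is dropped. The strict `>` comparison follows the claim wording that the third candidate frame has a foreground area greater than the first preset threshold.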
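Step 105: determine the target frame of the moving target object according to the second candidate frame and the third candidate frame.
In some embodiments, the second candidate frame and the third candidate frame are fused, and the target frame of the moving target object is determined according to the overlapping area of the second candidate frame and the third candidate frame. That is, by combining the detection results of the static image with dynamic motion information, the large number of static objects that are useless for identifying "violent sorting" are filtered out, which effectively reduces the amount of computation and improves computational efficiency.
The present disclosure proposes a method for determining a moving object in an image. In the process of determining the target frame of a moving target object in the current frame video image, a plurality of first candidate frames of the target object in the current frame video image are determined, and a moving foreground image of the current frame video image is determined; according to the moving foreground image, a second candidate frame of the moving target object in the current frame video image is determined, and the plurality of first candidate frames are filtered to obtain a third candidate frame; the target frame of the moving target object is then determined according to the second candidate frame and the third candidate frame. Thus, by filtering the candidate frames of the target object in combination with the moving foreground image of the current frame video image, the influence of a large number of useless candidate frames on determining the target frame of the moving target object is reduced, so that the target frame of the moving target object in the current frame video image is determined accurately while computational efficiency is improved.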
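FIG. 2 is a schematic flowchart of another method for determining a moving object in an image provided by an embodiment of the present disclosure. As shown in FIG. 2, the method includes steps 201 to 206.
Step 201: determine a plurality of first candidate frames of the target object in the current frame video image, where each first candidate frame marks the image region occupied by the target object in the current frame image.
It should be noted that, for the specific implementation of step 201, reference may be made to the relevant description in the above embodiment.
Step 202: determine the moving foreground image of the current frame video image.
In some embodiments, one implementation of determining the moving foreground image of the current frame video image is as follows: the previous frame video image of the current frame video image is obtained, the background image of the current frame video image is determined according to the previous frame video image, and foreground recognition is then performed on the current frame video image according to the background image to obtain the moving foreground image of the current frame video image, where the moving foreground image characterizes the change of the current frame video image relative to the background image. In this way, the moving object serving as foreground is extracted effectively and accurately from the static image.
Step 203: determine, according to the moving foreground image, the second candidate frame of the moving target object in the current frame video image.
In some embodiments, there are a plurality of moving foreground images, and one implementation of determining the second candidate frame according to the moving foreground image is as follows: the connected regions of each moving foreground image are determined, and the total area of the connected regions in each moving foreground image is calculated; the largest circumscribed rectangular frame of each moving foreground image is determined; a target moving foreground image, for which the ratio of the total area of the connected regions to the area of the largest circumscribed rectangular frame is greater than a third preset threshold, is obtained from the plurality of moving foreground images; and the largest circumscribed rectangular frame corresponding to the target moving foreground image is determined as the second candidate frame of the moving target object. In this way, the second candidate frame of the moving target object in the current frame video image is accurately determined based on connected-region computation on the moving foreground image.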
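Step 204: filter the plurality of first candidate frames according to the moving foreground image to obtain the third candidate frame, where the area of moving foreground pixels in the third candidate frame is greater than the first preset threshold.
In some embodiments, the area of moving foreground pixels in each first candidate frame may be determined according to the moving foreground image and compared with the first preset threshold corresponding to the target object; when the area of moving foreground pixels in the first candidate frame is greater than the first preset threshold, that first candidate frame is used as a third candidate frame.
It can be understood that the area of moving foreground pixels in each first candidate frame may be determined as follows: the position information of the first candidate frame in the current frame video image is obtained, and the area of the moving foreground pixels located within that position is determined from the moving foreground image, which gives the area of moving foreground pixels in the first candidate frame.
Step 205: determine the overlapping region between the second candidate frame and the third candidate frame.
Step 206: when the area of the overlapping region is greater than the second preset threshold, use either the second candidate frame or the third candidate frame as the target frame of the moving target object.
In some embodiments, when the area of the overlapping region is less than or equal to the second preset threshold, both the second candidate frame and the third candidate frame are used as target frames of the moving target object, so that the target frame of the target object is accurately determined according to the overlapping region between the second candidate frame and the third candidate frame.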
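Steps 205 and 206 can be sketched as below. This is an illustrative sketch under stated assumptions, not the implementation of the disclosure: frames are (x1, y1, x2, y2) tuples, the overlap is measured as the raw intersection area in pixels, and when two frames overlap by more than the threshold the sketch keeps the third candidate frame (the text allows either one to be kept). Handling several second and third candidate frames at once is also an assumption; `overlap_area`, `fuse_candidates`, and `overlap_threshold` are hypothetical names.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2)


def overlap_area(a: Box, b: Box) -> int:
    """Area of the intersection rectangle of two frames (0 if they do not intersect)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(width, 0) * max(height, 0)


def fuse_candidates(second: List[Box], third: List[Box], overlap_threshold: int) -> List[Box]:
    """Merge second and third candidate frames into the final target frames.

    Every third candidate frame is kept. A second candidate frame is kept only
    if its overlap with every third candidate frame is at most the threshold;
    otherwise it is treated as a duplicate of a frame that is already kept.
    """
    targets = list(third)
    for box in second:
        if all(overlap_area(box, kept) <= overlap_threshold for kept in third):
            targets.append(box)
    return targets
```

Keeping the detector's frame on a heavy overlap is a deterministic stand-in for "either one"; choosing the motion-derived frame instead would be equally consistent with the text.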
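The present disclosure proposes a method for determining a moving object in an image. A plurality of first candidate frames of the target object in the current frame video image are determined, and a moving foreground image of the current frame video image is determined; according to the moving foreground image, a second candidate frame of the moving target object in the current frame video image is determined, and the plurality of first candidate frames are filtered to obtain a third candidate frame; when the area of the overlapping region between the second candidate frame and the third candidate frame is greater than the second preset threshold, either the second candidate frame or the third candidate frame is used as the target frame of the moving target object. Thus, in the process of detecting the motion of the target object, the target frame of the target object is accurately determined based on the screening of static images and on dynamic motion information.
FIG. 3 is a schematic flowchart of another method for determining a moving object in an image provided by an embodiment of the present disclosure. As shown in FIG. 3, the method includes steps 301 to 306.
Step 301: obtain a video image sequence.
In some embodiments, an input video stream or local video file may be decoded and sampled into frames to extract an image sequence, which serves as the input for target object detection.
Step 302: object detection on a single static picture.
In some embodiments, object detection is performed on each static frame in the captured image sequence to extract the target frames $B_{model}(t)$ of the parcels in each frame. It can be understood that the ways of extracting the parcel target frames in each frame may include, but are not limited to, object detection and deep-learning neural network models.
Here, $B_{model}(t)$ denotes the target frames predicted by the model for the static picture, $t$ is the frame index in the image sequence, and each element is the coordinates $(x_1, y_1, x_2, y_2)$ of the top-left and bottom-right corners of a target frame.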
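Step 303: moving foreground region segmentation.
In some embodiments, moving foreground region segmentation may use, but is not limited to, the frame difference method, the background modeling method, or the optical flow method, and this embodiment does not specifically limit it.
Both the frame difference method and the optical flow method take the current frame and the previous frame as input and compute the change information of the image; the result of the frame difference method or optical flow method is then binarized using an empirical threshold suited to the environment, which yields the moving foreground image, as shown in FIG. 4. The left side of FIG. 4 is a static color (red-green-blue, RGB) image of a certain frame; the right side of FIG. 4, excluding the dashed boxes, is the moving foreground segmentation result, in which the gray region is the moving foreground image $F(x, y, t)$.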
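The frame difference variant of step 303 can be sketched as follows. This is a minimal sketch that assumes both frames are grayscale images of equal size stored as nested lists of integer intensities; color handling, denoising, and the choice of threshold are left out. The function name `frame_difference_mask` and the parameter `threshold` are hypothetical.

```python
from typing import List

Image = List[List[int]]  # grayscale frame as rows of pixel intensities (assumption)


def frame_difference_mask(previous: Image, current: Image, threshold: int) -> Image:
    """Binarize the absolute difference of two frames.

    A pixel is marked 1 (moving foreground) when its intensity changed by more
    than the threshold between the previous and the current frame, else 0.
    """
    return [
        [1 if abs(cur - prev) > threshold else 0 for prev, cur in zip(prev_row, cur_row)]
        for prev_row, cur_row in zip(previous, current)
    ]
```

The resulting binary mask plays the role of $F(x, y, t)$ in the steps that follow. In practice the threshold would be tuned to the scene, as the text notes.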
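It can be understood that moving foreground segmentation refers to marking, among the multiple motions present in a sequence, the pixels associated with each independent motion and clustering those pixels by the object to which they belong; its main purpose is to extract the moving object, as foreground, from the static background.
Step 304: generate additional moving-object target frames.
In some embodiments, according to the moving foreground segmentation result obtained in step 303, connected-region computation is performed on the gray moving foreground image on the right side of FIG. 4, and the largest circumscribed rectangular frame is obtained for the outermost edge of each connected region, such as rectangular frames c and d on the right side of FIG. 4. Because parcels move at high speed, the region of change caused by a moving parcel usually appears in the segmentation result as a connected region with few internal holes. In addition, most parcels in logistics scenarios are cuboid. Therefore, the ratio of the total area of the connected region, $S_{roi}$, to the area of its outer largest circumscribed frame, $S_{rec}$, that is, $ratio = S_{roi} / S_{rec}$, can be used as a constraint for generating additional moving-parcel target frames: when $ratio$ is greater than a certain threshold, the candidate frame can be regarded as a moving parcel. For example, candidate frame c on the right side of FIG. 4 is a truly moving parcel, whereas candidate frame d is caused by local motion of a person's arm.
In addition, candidate frames whose total connected-region area is less than a certain threshold can be filtered out to remove noise and improve the accuracy of the detection result.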
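Step 304 can be sketched as below. This is an illustration under stated assumptions, not the implementation of the disclosure: the mask is a nested list of 0/1 values, regions are 4-connected (the text does not specify the connectivity), each connected region is treated as one moving foreground region, and frames are (x1, y1, x2, y2) with an exclusive bottom-right corner. The names `connected_regions`, `second_candidates`, `min_ratio`, and `min_area` are hypothetical.

```python
from collections import deque
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2); bottom-right corner exclusive (assumption)
Mask = List[List[int]]           # binary mask: 1 = moving foreground, 0 = background


def connected_regions(mask: Mask) -> List[Tuple[int, Box]]:
    """Return (pixel area, circumscribed rectangle) for each 4-connected region of 1-pixels."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    regions = []
    for y in range(height):
        for x in range(width):
            if mask[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first flood fill starting from an unvisited foreground pixel.
            seen[y][x] = True
            queue = deque([(x, y)])
            area, x1, x2, y1, y2 = 0, x, x, y, y
            while queue:
                cx, cy = queue.popleft()
                area += 1
                x1, x2 = min(x1, cx), max(x2, cx)
                y1, y2 = min(y1, cy), max(y2, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if 0 <= nx < width and 0 <= ny < height and mask[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            regions.append((area, (x1, y1, x2 + 1, y2 + 1)))
    return regions


def second_candidates(mask: Mask, min_ratio: float, min_area: int = 0) -> List[Box]:
    """Keep rectangles whose fill ratio S_roi / S_rec exceeds min_ratio and whose area is not too small."""
    boxes = []
    for area, (x1, y1, x2, y2) in connected_regions(mask):
        rect_area = (x2 - x1) * (y2 - y1)
        if area >= min_area and area / rect_area > min_ratio:
            boxes.append((x1, y1, x2, y2))
    return boxes
```

A solid block fills its rectangle completely (ratio 1.0) and passes, while a thin L-shaped region, loosely analogous to the arm motion in frame d, fills only part of its rectangle and is rejected when `min_ratio` is set high enough.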
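Step 305: result fusion.
In some embodiments, static target objects in the object detection results of the static picture are first filtered out according to the moving foreground segmentation image obtained in step 303. The segmentation image obtained in step 303 is a binary image in which moving foreground pixels are 1 and background pixels are 0.
For each first candidate frame obtained by parcel detection on a static frame, the area of pixels with value 1 inside it is calculated; if the area is less than a certain threshold, the first candidate frame is considered to contain no moving target, so that a large number of static first candidate frames are filtered out. The filtered first candidate frames are then fused with the additionally generated moving-object target frames: for two frames whose mutual overlapping area is greater than a certain threshold, one is deleted at random and only the other is kept as the final target candidate frame; for two frames whose mutual overlapping area is less than or equal to that threshold, both are kept as final target candidate frames.
Step 306: output the moving object detection result.
The present disclosure proposes a method for determining a moving object in an image. The image sequence of the obtained video is taken as input, parcel detection is performed on each static frame, and moving foreground region segmentation is performed to generate additional moving-parcel target frames; the moving-parcel target frames are then fused with the detection frames of the static object detection results to obtain the target candidate frames, and the moving object detection result is output. Thus, in the process of detecting the motion of the target object, the detection results of the static image are fused with dynamic motion information, so that the moving object detection result is obtained accurately and output.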
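FIG. 5 is a schematic structural diagram of an apparatus for determining a moving object in an image provided by an embodiment of the present disclosure.
As shown in FIG. 5, the apparatus 500 for determining a moving object in an image includes modules 501 to 505.
The first determining module 501 is configured to determine a plurality of first candidate frames of the target object in the current frame video image, where each first candidate frame marks the image region occupied by the target object in the current frame image.
The second determining module 502 is configured to determine the moving foreground image of the current frame video image.
The third determining module 503 is configured to determine, according to the moving foreground image, the second candidate frame of the moving target object in the current frame video image.
The filtering module 504 is configured to filter the plurality of first candidate frames according to the moving foreground image to obtain the third candidate frame, where the area of moving foreground pixels in the third candidate frame is greater than the first preset threshold.
The fourth determining module 505 is configured to determine the target frame of the moving target object according to the second candidate frame and the third candidate frame.
The present disclosure proposes an apparatus for determining a moving object in an image. In the process of determining the target frame of a moving target object in the current frame video image, a plurality of first candidate frames of the target object in the current frame video image are determined, and a moving foreground image of the current frame video image is determined; according to the moving foreground image, a second candidate frame of the moving target object in the current frame video image is determined, and the plurality of first candidate frames are filtered to obtain a third candidate frame; the target frame of the moving target object is then determined according to the second candidate frame and the third candidate frame. Thus, by filtering the candidate frames of the target object in combination with the moving foreground image of the current frame video image, the influence of a large number of useless candidate frames on determining the target frame of the moving target object is reduced, so that the target frame of the moving target object in the current frame video image is determined accurately while computational efficiency is improved.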
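In an embodiment of the present disclosure, as shown in FIG. 6, the fourth determining module 505 includes:
a determining unit 5051, configured to determine the overlapping region between the second candidate frame and the third candidate frame.
a first generation unit 5052, configured to use either the second candidate frame or the third candidate frame as the target frame of the moving target object when the area of the overlapping region is greater than the second preset threshold.
In an embodiment of the present disclosure, as shown in FIG. 6, the fourth determining module 505 further includes:
a second generation unit 5053, configured to use both the second candidate frame and the third candidate frame as target frames of the moving target object when the area of the overlapping region is less than or equal to the second preset threshold.
In an embodiment of the present disclosure, as shown in FIG. 6, the second determining module 502 is specifically configured to:
determine the connected regions of each moving foreground image, and calculate the total area of the connected regions in each moving foreground image.
determine the largest circumscribed rectangular frame of each moving foreground image.
obtain, from the plurality of moving foreground images, a target moving foreground image for which the ratio of the total area of the connected regions to the area of the largest circumscribed rectangular frame is greater than the third preset threshold.
determine the largest circumscribed rectangular frame corresponding to the target moving foreground image as the second candidate frame of the moving target object.
In an embodiment of the present disclosure, as shown in FIG. 6, the first determining module 501 is specifically configured to:
obtain the previous frame video image of the current frame video image.
determine the background image of the current frame video image according to the previous frame video image.
perform foreground recognition on the current frame video image according to the background image to obtain the moving foreground image of the current frame video image, where the moving foreground image characterizes the change of the current frame video image relative to the background image.
The present disclosure proposes a method for determining a moving object in an image. In the process of determining the target frame of a moving target object in the current frame video image, a plurality of first candidate frames of the target object in the current frame video image are determined, and a moving foreground image of the current frame video image is determined; according to the moving foreground image, a second candidate frame of the moving target object in the current frame video image is determined, and the plurality of first candidate frames are filtered to obtain a third candidate frame; the target frame of the moving target object is then determined according to the second candidate frame and the third candidate frame. Thus, by filtering the candidate frames of the target object in combination with the moving foreground image of the current frame video image, the influence of a large number of useless candidate frames on determining the target frame of the moving target object is reduced, so that the target frame of the moving target object in the current frame video image is determined accurately while computational efficiency is improved.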
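FIG. 7 is a block diagram of an electronic device according to an embodiment of the present disclosure.
As shown in FIG. 7, the electronic device includes:
a memory 701, a processor 702, and computer instructions stored on the memory 701 and executable on the processor 702.
When executing the instructions, the processor 702 implements the method for determining a moving object in an image provided in the above embodiments.
Further, the electronic device also includes:
a communication interface 703, used for communication between the memory 701 and the processor 702.
The memory 701 is used for storing computer instructions executable on the processor 702.
The memory 701 may include high-speed RAM and may also include non-volatile memory, for example at least one disk memory.
The processor 702 is configured to implement the method for determining a moving object in an image of the above embodiments when executing the program.
If the memory 701, the processor 702, and the communication interface 703 are implemented independently, they may be connected to one another and communicate with one another through a bus. The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is shown in FIG. 7, but this does not mean that there is only one bus or one type of bus.
In some embodiments, in a specific implementation, if the memory 701, the processor 702, and the communication interface 703 are integrated on one chip, they may communicate with one another through an internal interface.
The processor 702 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present disclosure.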
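Embodiments of the present disclosure also propose a computer-readable storage medium on which a computer program is stored, where the program, when executed by a processor, implements the method for determining a moving object in an image proposed in the foregoing embodiments of the present disclosure. In some embodiments, the computer-readable storage medium is a non-transitory computer-readable storage medium.
Embodiments of the present disclosure also propose a computer program product, including a computer program, where the computer program, when executed by a processor, implements the method for determining a moving object in an image proposed in the foregoing embodiments of the present disclosure.
Embodiments of the present disclosure also propose a computer program, where the computer program includes computer program code, and when the computer program code is run on a computer, the computer is caused to execute the method for determining a moving object in an image proposed in the foregoing embodiments of the present disclosure.
It should be noted that the foregoing explanation of the method embodiments also applies to the electronic device, the non-transitory computer-readable storage medium, the computer program product, and the computer program of the above embodiments, and is not repeated here.
In addition, the terms "first" and "second" are used for descriptive purposes only and cannot be understood as indicating or implying relative importance or implicitly specifying the number of the indicated technical features. Thus, a feature defined with "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present disclosure, "a plurality of" means at least two, for example two or three, unless otherwise specifically defined.
In the description of this specification, reference to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present disclosure. In this specification, schematic uses of the above terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in a suitable manner in any one or more embodiments or examples. In addition, those skilled in the art may combine the different embodiments or examples described in this specification, and the features of those embodiments or examples, provided they do not contradict one another.
Although embodiments of the present disclosure have been shown and described above, it can be understood that the above embodiments are exemplary and should not be construed as limiting the present disclosure; those of ordinary skill in the art may make changes, modifications, substitutions, and variations to the above embodiments within the scope of the present disclosure.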

Claims (14)

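  1. A method for determining a moving object in an image, characterized in that the method comprises:
    determining a plurality of first candidate frames of a target object in a current frame video image, wherein each first candidate frame marks the image region occupied by the target object in the current frame image;
    determining a moving foreground image of the current frame video image;
    determining, according to the moving foreground image, a second candidate frame of the moving target object in the current frame video image;
    filtering the plurality of first candidate frames according to the moving foreground image to obtain a third candidate frame, wherein the area of moving foreground pixels in the third candidate frame is greater than a first preset threshold;
    determining a target frame of the moving target object according to the second candidate frame and the third candidate frame.
  2. The method according to claim 1, characterized in that determining the target frame of the moving target object according to the second candidate frame and the third candidate frame comprises:
    determining an overlapping region between the second candidate frame and the third candidate frame;
    when the area of the overlapping region is greater than a second preset threshold, using the second candidate frame or the third candidate frame as the target frame of the moving target object.
  3. The method according to claim 2, characterized in that the method further comprises:
    when the area of the overlapping region is less than or equal to the second preset threshold, using both the second candidate frame and the third candidate frame as target frames of the moving target object.
  4. The method according to any one of claims 1 to 3, characterized in that there are a plurality of moving foreground images, and determining, according to the moving foreground image, the second candidate frame of the moving target object in the current frame video image comprises:
    determining the connected regions of each moving foreground image, and calculating the total area of the connected regions in each moving foreground image;
    determining the largest circumscribed rectangular frame of each moving foreground image;
    obtaining, from the plurality of moving foreground images, a target moving foreground image for which the ratio of the total area of the connected regions to the area of the largest circumscribed rectangular frame is greater than a third preset threshold;
    determining the largest circumscribed rectangular frame corresponding to the target moving foreground image as the second candidate frame of the moving target object.
  5. The method according to any one of claims 1 to 4, characterized in that determining the moving foreground image of the current frame video image comprises:
    obtaining the previous frame video image of the current frame video image;
    determining a background image of the current frame video image according to the previous frame video image;
    performing foreground recognition on the current frame video image according to the background image to obtain the moving foreground image of the current frame video image, wherein the moving foreground image characterizes the change of the current frame video image relative to the background image.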
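  6. An apparatus for determining a moving object in an image, characterized in that the apparatus comprises:
    a first determining module, configured to determine a plurality of first candidate frames of a target object in a current frame video image, wherein each first candidate frame marks the image region occupied by the target object in the current frame image;
    a second determining module, configured to determine a moving foreground image of the current frame video image;
    a third determining module, configured to determine, according to the moving foreground image, a second candidate frame of the moving target object in the current frame video image;
    a filtering module, configured to filter the plurality of first candidate frames according to the moving foreground image to obtain a third candidate frame, wherein the area of moving foreground pixels in the third candidate frame is greater than a first preset threshold;
    a fourth determining module, configured to determine a target frame of the moving target object according to the second candidate frame and the third candidate frame.
  7. The apparatus according to claim 6, characterized in that the fourth determining module comprises:
    a determining unit, configured to determine an overlapping region between the second candidate frame and the third candidate frame;
    a first generation unit, configured to use the second candidate frame or the third candidate frame as the target frame of the moving target object when the area of the overlapping region is greater than a second preset threshold.
  8. The apparatus according to claim 7, characterized in that the fourth determining module further comprises:
    a second generation unit, configured to use both the second candidate frame and the third candidate frame as target frames of the moving target object when the area of the overlapping region is less than or equal to the second preset threshold.
  9. The apparatus according to any one of claims 6 to 8, characterized in that there are a plurality of moving foreground images, and the second determining module is specifically configured to:
    determine the connected regions of each moving foreground image, and calculate the total area of the connected regions in each moving foreground image;
    determine the largest circumscribed rectangular frame of each moving foreground image;
    obtain, from the plurality of moving foreground images, a target moving foreground image for which the ratio of the total area of the connected regions to the area of the largest circumscribed rectangular frame is greater than a third preset threshold;
    determine the largest circumscribed rectangular frame corresponding to the target moving foreground image as the second candidate frame of the moving target object.
  10. The apparatus according to any one of claims 6 to 9, characterized in that the first determining module is specifically configured to:
    obtain the previous frame video image of the current frame video image;
    determine a background image of the current frame video image according to the previous frame video image;
    perform foreground recognition on the current frame video image according to the background image to obtain the moving foreground image of the current frame video image, wherein the moving foreground image characterizes the change of the current frame video image relative to the background image.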
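  11. An electronic device, characterized by comprising:
    a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the method for determining a moving object in an image according to any one of claims 1 to 5.
  12. A computer-readable storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the method for determining a moving object in an image according to any one of claims 1 to 5.
  13. A computer program product, comprising a computer program, wherein the computer program, when executed by a processor, implements the method for determining a moving object in an image according to any one of claims 1 to 5.
  14. A computer program, characterized in that the computer program comprises computer program code, and when the computer program code is run on a computer, the computer is caused to execute the method for determining a moving object in an image according to any one of claims 1 to 5.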
PCT/CN2022/134488 2022-02-25 2022-11-25 Method and apparatus for determining a moving object in an image, electronic device, and storage medium WO2023160061A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210179846.0 2022-02-25
CN202210179846.0A CN114550062A (zh) 2022-02-25 2022-02-25 Method and apparatus for determining a moving object in an image, electronic device, and storage medium

Publications (1)

Publication Number Publication Date
WO2023160061A1 true WO2023160061A1 (zh) 2023-08-31

Family

ID=81678660

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/134488 WO2023160061A1 (zh) 2022-02-25 2022-11-25 Method and apparatus for determining a moving object in an image, electronic device, and storage medium

Country Status (2)

Country Link
CN (1) CN114550062A (zh)
WO (1) WO2023160061A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114550062A (zh) * 2022-02-25 2022-05-27 京东科技信息技术有限公司 Method and apparatus for determining a moving object in an image, electronic device, and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
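CN108647649A (zh) * 2018-05-14 2018-10-12 中国科学技术大学 Method for detecting abnormal behavior in video
CN110363790A (zh) * 2018-04-11 2019-10-22 北京京东尚科信息技术有限公司 Target tracking method, apparatus, and computer-readable storage medium
CN113807185A (zh) * 2021-08-18 2021-12-17 苏州涟漪信息科技有限公司 Data processing method and apparatus
CN114550062A (zh) * 2022-02-25 2022-05-27 京东科技信息技术有限公司 Method and apparatus for determining a moving object in an image, electronic device, and storage medium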

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
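CN106600625A (zh) * 2016-12-13 2017-04-26 广东沅朋网络科技有限公司 Image processing method and apparatus for detecting small organisms
CN110310301B (zh) * 2018-03-27 2021-07-16 华为技术有限公司 Method and apparatus for detecting a target object
CN110879951B (zh) * 2018-09-06 2022-10-25 华为技术有限公司 Moving foreground detection method and apparatus
CN110798592B (zh) * 2019-10-29 2022-01-04 普联技术有限公司 Video-image-based object movement detection method, apparatus, device, and storage medium
CN112749590B (zh) * 2019-10-30 2023-02-07 上海高德威智能交通系统有限公司 Target detection method, apparatus, computer device, and computer-readable storage medium
CN112347967B (zh) * 2020-11-18 2023-04-07 北京理工大学 Pedestrian detection method fusing motion information in complex scenes
CN113822110B (zh) * 2021-01-07 2023-08-08 北京京东振世信息技术有限公司 Target detection method and apparatus
CN113192057A (zh) * 2021-05-21 2021-07-30 上海西井信息科技有限公司 Target detection method, system, device, and storage medium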

Also Published As

Publication number Publication date
CN114550062A (zh) 2022-05-27

Similar Documents

Publication Publication Date Title
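CN109325954B (zh) Image segmentation method, apparatus, and electronic device
CN111160379B (zh) Training method and apparatus for an image detection model, and target detection method and apparatus
JP6445775B2 (ja) Image processing apparatus and image processing method
CN108875723B (zh) Object detection method, apparatus, system, and storage medium
KR101896357B1 (ko) Method, device, and program for detecting an object
CN107944403B (zh) Method and apparatus for detecting pedestrian attributes in an image
CN109727275B (zh) Target detection method, apparatus, system, and computer-readable storage medium
CN112101386B (zh) Text detection method, apparatus, computer device, and storage medium
CN111626163B (zh) Face liveness detection method, apparatus, and computer device
CN109816694B (zh) Target tracking method, apparatus, and electronic device
JP6642970B2 (ja) Region-of-interest detection device, region-of-interest detection method, and program
CN105894464A (zh) Median-filter image processing method and apparatus
CN112926531A (zh) Feature information extraction method, model training method, apparatus, and electronic device
CN113449606B (zh) Target object recognition method, apparatus, computer device, and storage medium
CN111553302B (zh) Key frame selection method, apparatus, device, and computer-readable storage medium
CN110570442A (zh) Contour detection method in complex backgrounds, terminal device, and storage medium
CN109447022B (zh) Shot type recognition method and apparatus
WO2023160061A1 (zh) Method and apparatus for determining a moving object in an image, electronic device, and storage medium
CN114067186B (zh) Pedestrian detection method, apparatus, electronic device, and storage medium
WO2022206679A1 (zh) Image processing method, apparatus, computer device, and storage medium
EP4332910A1 (en) Behavior detection method, electronic device, and computer readable storage medium
CN113762027B (zh) Abnormal behavior recognition method, apparatus, device, and storage medium
CN114399657A (zh) Vehicle detection model training method and apparatus, vehicle detection method, and electronic device
CN114898306A (zh) Method and apparatus for detecting target orientation, and electronic device
CN115170612A (zh) Detection and tracking method, apparatus, electronic device, and storage medium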

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22928321

Country of ref document: EP

Kind code of ref document: A1