WO2022135511A1 - Method and apparatus for locating a moving object, electronic device, and storage medium - Google Patents

Method and apparatus for locating a moving object, electronic device, and storage medium

Info

Publication number
WO2022135511A1
Authority
WO
WIPO (PCT)
Prior art keywords
event
frame
area
moving object
sampling
Prior art date
Application number
PCT/CN2021/140765
Other languages
English (en)
French (fr)
Inventor
马欣
吴臻志
祝夭龙
Original Assignee
北京灵汐科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京灵汐科技有限公司
Publication of WO2022135511A1 publication Critical patent/WO2022135511A1/zh

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras

Definitions

  • the embodiments of the present disclosure relate to the technical field of image recognition, and in particular, to a method, apparatus, electronic device, and storage medium for positioning a moving object.
  • feature extraction is usually performed directly in the global image for the video image obtained by the camera assembly, and moving objects are located in the image according to the extracted image features.
  • Embodiments of the present disclosure provide a method, device, electronic device, and storage medium for positioning a moving object.
  • In a first aspect, an embodiment of the present disclosure provides a method for locating a moving object. The method includes: acquiring event stream information through a dynamic vision sensor, and acquiring image information through a target camera component; sampling the event stream information according to a preset sampling period to obtain a sampling event frame; determining, according to the event stream information corresponding to the sampling event frame, a predicted position area of the moving object in the sampling event frame; and determining, according to the predicted position area, a positioning area in the image information that matches the predicted position area.
  • an embodiment of the present disclosure provides a positioning device for a moving object, the positioning device includes: an information acquisition module for acquiring event stream information through a dynamic vision sensor, and acquiring image information through a target camera assembly; a sampling execution module , for sampling the event stream information according to a preset sampling period to obtain a sampling event frame, and determining the predicted location area of a moving object in the sampling event frame according to the event stream information corresponding to the sampling event frame ; a classification execution module, configured to determine, according to the predicted position region, a positioning region in the image information that matches the predicted position region.
  • In a third aspect, an embodiment of the present disclosure provides an electronic device, including: one or more processors; and a memory for storing one or more programs; when the one or more programs are executed by the one or more processors, the one or more processors implement the method for locating a moving object described in any embodiment of the present disclosure.
  • an embodiment of the present disclosure further provides a computer-readable storage medium on which a computer program is stored, and when the program is executed by a processor, implements the method for locating a moving object described in any embodiment of the present disclosure.
  • According to the technical solutions of the embodiments of the present disclosure, after the event stream information is acquired through the dynamic vision sensor, the predicted position area of the moving object in the sampling event frame is determined, and the matching positioning area is determined in the image information of the target camera assembly. This improves the positioning efficiency for moving objects, and in particular the real-time performance of detecting high-speed moving objects.
  • FIG. 1A is a schematic flowchart of a method for locating a moving object according to an embodiment of the present disclosure
  • FIG. 1B is a schematic flowchart of a method for determining a contour region of a moving object in a sampling event frame according to an embodiment of the present disclosure
  • 1C is a schematic diagram of a predicted position area of a moving object provided by an embodiment of the present disclosure
  • 1D is a schematic flowchart of a method for determining a location area in image information that matches a predicted location area in an embodiment of the present disclosure
  • FIG. 2 is a schematic flowchart of another method for locating a moving object provided by an embodiment of the present disclosure
  • FIG. 3 is a structural block diagram of a device for positioning a moving object provided by an embodiment of the present disclosure
  • FIG. 4 is a structural block diagram of an electronic device provided by an embodiment of the present disclosure.
  • FIG. 1A is a schematic flowchart of a method for locating a moving object according to an embodiment of the present disclosure.
  • The embodiments of the present disclosure can be used to detect whether there is a moving object in the image information captured by the target camera assembly, and to locate, identify, and classify moving objects in that image information. The method can be performed by the device for positioning moving objects in the embodiments of the present disclosure; the device can be implemented by software and/or hardware and integrated in an electronic device. The method includes the following steps S110 to S140.
  • step S110 event stream information is acquired through a dynamic vision sensor, and image information is acquired through a target camera assembly.
  • A Dynamic Vision Sensor (DVS) is an image acquisition device that adopts an asynchronous pixel mechanism and is based on Address-Event Representation (AER). Compared with the related-art scheme of acquiring "frames" at a fixed rate and reading all pixel information of each frame in turn, a DVS does not need to read every pixel in the picture; it only needs to obtain the addresses and information of the pixels whose light intensity changes.
  • Specifically, when the sensor detects that the light intensity change of a certain pixel is greater than or equal to a preset threshold value, it emits an event signal for that pixel. If the light intensity change is a positive change, that is, the pixel jumps from low brightness to high brightness, a "+1" signal is emitted and the pixel is marked as a positive event; if the light intensity change is a negative change, that is, the pixel jumps from high brightness to low brightness, a "-1" signal is emitted and the pixel is marked as a negative event. If the light intensity change is less than the preset threshold value, no event signal is emitted and the pixel is marked as no event. The dynamic vision sensor uses the event marking of each pixel to form the event stream information.
  • The target camera assembly is a shooting device that converts optical image signals into electrical signals and then stores or transmits the electrical signals. It can include various types of shooting devices, such as high-speed image acquisition devices and surveillance cameras. A high-speed image acquisition device is an image acquisition device used for high-speed capture of digitized video image information, and can transmit, display, and store the acquired image data stream along a pre-arranged path. In the embodiments of the present disclosure, the high-speed image acquisition device quickly captures RGB (red, green, and blue three-channel) images in the visible light range and generates high-speed picture frames to ensure that the trajectory of a high-speed moving object can be captured.
  • The frame rate of the generated picture frames can be on the order of one thousand to one hundred thousand frames per second.
  • the event flow information of the target scene is acquired by the dynamic vision sensor, and the image information of the target scene is acquired by the target camera component.
  • In other words, the event stream information and the image information are captured from the same scene and depict the same content. The event stream information and the image information can be acquired at the same time; alternatively, the event stream information can be acquired first through the dynamic vision sensor, and after positioning has been performed in the sampling event frame, the image information is obtained through the target camera component.
  • To ensure that the dynamic vision sensor and the target camera assembly capture the same picture content, they can be placed at adjacent shooting positions (for example, the dynamic vision sensor and the target camera assembly can be integrated in the same electronic device), so that the cameras of the two devices are close enough to reduce the parallax between their shooting angles, and the shooting angles of the two cameras can be adjusted to ensure that images of the same scene are obtained.
  • Step S120 Sampling the event stream information according to a preset sampling period to obtain a sampling event frame.
  • Step S130 Determine the predicted position area of the moving object in the sampling event frame according to the event stream information corresponding to the sampling event frame.
  • Compared with a background whose brightness changes little, the light intensity of the pixels in the area that a moving object passes through will change to different degrees. For example, when a moving object appears, the light intensity of the pixels in the area where it appears increases significantly; when the moving object disappears, the light intensity of the pixels in the area it leaves decreases significantly. Therefore, according to the event stream information, it can be determined which pixels in the picture may correspond to a moving object.
  • Specifically, within the preset sampling period, if the event stream information of a pixel includes a positive event or a negative event, that pixel may be related to a moving object. The sampling event frame is the image frame obtained by aggregating, over the preset sampling period, all labeled events of every pixel; the event stream information of the sampling event frame includes the event information corresponding to multiple pixels, and the event information of each pixel includes at least one labeled event. According to the labeled events (positive events and negative events) of the pixels in the sampling event frame, the position area of the moving object can be predicted.
  • The preset sampling period can be set according to actual needs. For example, to improve the efficiency of detecting moving objects in the event stream information, the preset sampling period can be set to a smaller value; to reduce the image processing load, the preset sampling period can be set to a larger value. In particular, because the detection accuracy of the DVS is high, event signals at a pixel can be detected at the nanosecond level (for example, every 1000 nanoseconds), whereas the preset sampling period is usually set at the millisecond level (for example, 10 milliseconds). Therefore, within one sampling period the light intensity of a single pixel may change several times, that is, the DVS may emit multiple event signals for that pixel. As long as the event information of the pixel includes at least one positive event and/or negative event within the preset sampling period, the pixel is included in the predicted position area of the moving object.
  • step S130 determining the predicted location area of the moving object in the sampling event frame according to the event flow information corresponding to the sampling event frame, may further include: according to the event flow information corresponding to the sampling event frame, Determine the contour area of the moving object in the sampling event frame, and mark the contour area through the area of interest frame to obtain the predicted position area of the moving object.
  • A region of interest (ROI) is an area to be processed that is outlined by a box, circle, ellipse, polygon, or the like. Since the contour of a moving object is usually an irregular shape that is inconvenient to locate in the image, in some embodiments the region-of-interest frame can be marked in the image as a rectangular frame, namely the smallest rectangle that contains both the appearance contour and the disappearance contour of the moving object; the rectangular frame and the area inside it constitute the predicted position area of the moving object.
  • the contour region of the moving object it can be obtained through the target detection algorithm in the sampling event frame, for example, through the sliding window detector or R-CNN (Regions with CNN features, region features based on convolutional neural network).
  • FIG. 1B is a schematic flowchart of a method for determining a contour region of a moving object in a sampling event frame according to an embodiment of the present disclosure.
  • In some embodiments of the present disclosure, as shown in FIG. 1B, step S130 of determining the contour area of the moving object in the sampling event frame according to the event stream information corresponding to the sampling event frame may further include steps S131 to S133.
  • Step S131 Acquire an event occurrence frame and an event disappearance frame according to the event stream information corresponding to the sampled event frame.
  • As described above, the event stream information of the sampling event frame includes the event information corresponding to a plurality of pixels in the sampling event frame; the event information corresponding to each pixel includes at least one labeled event, and a labeled event is labeled as either a positive event or a negative event.
  • As an optional implementation, step S131 of acquiring the event appearance frame and the event disappearance frame according to the event stream information corresponding to the sampling event frame may further include: determining, among the event information corresponding to the plurality of pixels, the pixels whose labeled events are labeled as positive events as event appearance pixels; determining, among the event information corresponding to the plurality of pixels, the pixels whose labeled events are labeled as negative events as event disappearance pixels; and generating the event appearance frame from all event appearance pixels and the event disappearance frame from all event disappearance pixels.
  • sampling event frame describes the event information of all pixels
  • event appearance frame describes the information of the pixels corresponding to all positive events
  • event disappearance frame describes the information of the pixels corresponding to all negative events.
  • the pixel resolution of the event appearance frame, the event disappearance frame and the sampling event frame is the same, and the pixel resolution of the sampling event frame is the same as that of the dynamic vision sensor DVS.
  • the pixel values corresponding to all event occurrence pixels are set as the first pixel value
  • the pixel values corresponding to all non-event occurrence pixels are set as the second pixel value.
  • the pixel values corresponding to all event disappearing pixels are set to the first pixel value
  • the pixel values corresponding to all non-event disappearing pixels are set to the second pixel value.
  • the first pixel value may be set to the maximum pixel value, that is, 255
  • the second pixel value may be set to the minimum pixel value, that is, 0.
  • the event occurrence frame may be represented by an event occurrence matrix
  • the event disappearance frame may be represented by an event disappearance matrix.
  • each element in the event occurrence matrix corresponds to each pixel of the event occurrence frame, and the position is set correspondingly, and the value of each element in the event occurrence matrix is the pixel value of the corresponding pixel;
  • Each element corresponds to each pixel of the event disappearance frame, and the position is set correspondingly, and the value of each element in the event disappearance matrix is the pixel value of the corresponding pixel.
  • Initially, an empty event appearance matrix and an empty event disappearance matrix may be preset. The value of every element in the empty event appearance matrix is initialized to the second pixel value (for example, 0); the number of element rows corresponds to the number of pixel rows in the pixel resolution of the sampling event frame, the number of element columns corresponds to the number of pixel columns, and each element corresponds to one pixel. The empty event disappearance matrix is initialized in the same way. For example, if the resolution of the dynamic vision sensor is 1024 (horizontal pixels) × 648 (vertical pixels), the empty event appearance matrix and the empty event disappearance matrix are both 1024 (rows) × 648 (columns) matrices. According to the labeled event of each pixel in the sampling event frame within the preset sampling period, each element of the empty event appearance matrix and the empty event disappearance matrix is assigned a value, so as to obtain the event appearance matrix and the event disappearance matrix.
  • For example, the values of all elements in the empty event appearance matrix and the empty event disappearance matrix are initialized to the second pixel value, for example 0. If, within the preset sampling period, the labeled event acquired for a pixel is a positive event, the element corresponding to that pixel in the empty event appearance matrix is assigned the first pixel value (that is, 255). By assigning the first pixel value (255) to the elements corresponding to all pixels that contain a positive event within the preset sampling period, while the elements corresponding to all pixels that contain only negative events or no events keep the second pixel value (0), the event appearance matrix is obtained. In the resulting event appearance matrix, the positions of the elements whose value is 255 indicate the trajectory along which the edge of the moving object appeared during the preset sampling period; therefore, the highlighted appearance contour of the moving object can be obtained in the image according to the event appearance matrix.
  • Similarly, if the labeled event acquired for a pixel within the preset sampling period is a negative event, the element corresponding to that pixel in the empty event disappearance matrix is assigned the first pixel value (255). By assigning the first pixel value (255) to the elements corresponding to all pixels that contain a negative event within the preset sampling period, while the elements corresponding to all pixels that contain only positive events or no events keep the second pixel value (0), the event disappearance matrix is obtained. In the resulting event disappearance matrix, the positions of the elements whose value is 255 indicate the trajectory along which the edge of the moving object disappeared during the preset sampling period; therefore, the highlighted disappearance contour of the moving object can be obtained in the image according to the event disappearance matrix. Finally, the union of the appearance contour and the disappearance contour of the moving object is taken as the contour information of the moving object.
  • Step S132 Determine the predicted appearance area of the moving object according to the event appearance frame, and determine the predicted disappearance area of the moving object according to the event disappearance frame.
  • step S132 the predicted occurrence area of the moving object is determined according to the positions of all pixel points whose pixel values are the first pixel value in the event occurrence frame, and the predicted occurrence area of the moving object is the area where the appearance outline of the aforementioned moving object is located;
  • the predicted disappearance area of the moving object is determined according to the positions of all pixels whose pixel value is the first pixel value in the event disappearance frame, and the predicted disappearance area of the moving object is the area where the disappearance contour of the moving object is located.
  • the position of the pixel point can be represented by two-dimensional position coordinates.
  • Step S133 Determine the contour area of the moving object according to the predicted appearance area and the predicted disappearance area.
  • The predicted appearance area of the moving object, i.e. the area where the appearance contour of the moving object is located, can be called the appearance contour area of the moving object; the predicted disappearance area of the moving object, i.e. the area where the disappearance contour of the moving object is located, can be called the disappearance contour area of the moving object. The appearance contour area and the disappearance contour area are combined to form the contour area of the moving object.
  • FIG. 1C is a schematic diagram of a predicted position area of a moving object provided by an embodiment of the present disclosure. As shown in FIG. 1C, a coordinate system is established with the top-left pixel of the sampling event frame as the origin. The region of interest corresponding to the appearance contour area of the moving object can be expressed as ROI_1 = [x11, y11, x12, y12], where (x11, y11) and (x12, y12) are the two-dimensional position coordinates of the top-left vertex A1 (the top-left pixel of the area) and of the bottom-right vertex B1 (the bottom-right pixel of the area) of the appearance contour area, respectively. The region of interest corresponding to the disappearance contour area of the moving object can be expressed as ROI_2 = [x21, y21, x22, y22], where (x21, y21) and (x22, y22) are the two-dimensional position coordinates of the top-left vertex A2 and the bottom-right vertex B2 of the disappearance contour area, respectively. The predicted position area ROI_DVS of the moving object in the sampling event frame can be expressed by the following formula:
  • ROI_DVS = [min(x11, x21), min(y11, y21), max(x12, x22), max(y12, y22)].
  • For example, when x11 < x21, y11 < y21, x12 < x22, and y12 < y22, then ROI_DVS = [x11, y11, x22, y22], that is, (x11, y11) is taken as the position coordinates of the top-left vertex of the predicted position area ROI_DVS and (x22, y22) is taken as the position coordinates of the bottom-right vertex of the predicted position area ROI_DVS, thereby determining the predicted position area ROI_DVS.
  • In some embodiments of the present disclosure, after the event appearance frame and the event disappearance frame are acquired and before step S132, the method for determining the contour area may further include: performing noise-point removal on the event appearance frame and/or the event disappearance frame.
  • Usually, the event appearance matrix corresponding to the event appearance frame and the event disappearance matrix corresponding to the event disappearance frame are both sparse matrices. Because of the sensitivity of the dynamic vision sensor, sparse noise points also appear in the background area outside the moving object, so these sparse noise points need to be removed. Specifically, an erosion operation and a dilation operation are performed on the pixels with non-zero pixel values in the event appearance frame and/or the event disappearance frame to remove the noise points, so that when the contour area of the moving object is detected on the binarized event appearance frame and/or event disappearance frame, the influence of noise points is effectively reduced and the accuracy of contour-area detection is improved.
  • Step S140: A positioning area that matches the predicted position area is determined in the image information according to the predicted position area.
  • After the predicted position area of the moving object in the sampling event frame of the dynamic vision sensor is determined, if the resolution of the dynamic vision sensor is the same as that of the target camera component, the sampling event frame and the image information acquired by the target camera component have the same resolution, so the predicted position area in the sampling event frame and the positioning area in the image information are the same area. In that case, an image to be detected that has the same shooting time, shooting position, and shooting angle as the sampling event frame is obtained from the image information, and the area of the image to be detected that is identical to the predicted position area is directly used as the positioning area. If the resolutions of the dynamic vision sensor and the target camera component are different, the proportional relationship between the two resolutions needs to be determined first, and the positioning area of the moving object in the image to be detected is determined according to this proportional relationship and the predicted position area in the sampling event frame.
  • FIG. 1D is a schematic flowchart of a method for determining a location area in image information that matches a predicted location area in an embodiment of the present disclosure.
  • In some embodiments of the present disclosure, as shown in FIG. 1D, step S140 of determining, according to the predicted position area, the positioning area in the image information that matches the predicted position area may further include steps S141 to S143.
  • Step S141 acquiring the proportional relationship between the resolutions of the dynamic vision sensor and the target camera assembly.
  • Step S142 scaling the predicted location area according to the proportional relationship.
  • Step S143 Map the zoomed predicted location area to the image information to determine a location area matching the preset location area.
  • In step S141, the proportional relationship between the resolutions of the dynamic vision sensor and the target camera assembly includes the ratio of the horizontal resolution (horizontal pixels) of the dynamic vision sensor to the horizontal resolution of the target camera assembly, and the ratio of the vertical resolution (vertical pixels) of the dynamic vision sensor to the vertical resolution of the target camera assembly. For example, if the resolution of the dynamic vision sensor is 1024 (horizontal pixels) × 648 (vertical pixels) and the resolution of the target camera assembly is 1280 (horizontal pixels) × 960 (vertical pixels), the ratio of the horizontal resolutions is 1024/1280 and the ratio of the vertical resolutions is 648/960.
  • In step S142, the ratio of the horizontal resolution of the dynamic vision sensor to the horizontal resolution of the target camera assembly is used as a horizontal adjustment factor (denoted here as a_x), and the ratio of the vertical resolution of the dynamic vision sensor to the vertical resolution of the target camera assembly is used as a vertical adjustment factor (denoted here as a_y). The predicted position area ROI_DVS is scaled in the horizontal and vertical directions according to the horizontal and vertical adjustment factors to obtain the scaled predicted position area ROI; consistently with the definitions above, the scaled area can be expressed as ROI = [min(x11, x21)/a_x, min(y11, y21)/a_y, max(x12, x22)/a_x, max(y12, y22)/a_y].
  • In step S143, the scaled predicted position area is mapped into the image information; the area in the image information that coincides with the scaled predicted position area is the matching positioning area, so that the positioning area matching the predicted position area is determined. The positioning area of the moving object in the image information can likewise be expressed by the formula above.
  • According to the technical solution of the method for locating a moving object provided by the embodiments of the present disclosure, after the event stream information is acquired through the dynamic vision sensor, the predicted position area of the moving object in the sampling event frame is determined, and the matching positioning area is determined in the image information of the target camera component. This improves the positioning efficiency for moving objects, and in particular the real-time performance of detecting high-speed moving objects.
  • FIG. 2 is a schematic flowchart of another method for locating a moving object provided by an embodiment of the present disclosure.
  • the trained image classification model can identify and classify the positioning area to determine whether there is a moving object in the image information, so as to realize the identification, classification and tracking of the moving object in the image information.
  • the positioning method may include the following steps: step S210 to step S250.
  • step S210 For the specific description of step S210, reference may be made to the above description of step S110, and details are not repeated here.
  • S220 Sample the event stream information according to a preset sampling period to obtain a sampled event frame.
  • step S220 For the specific description of step S220, reference may be made to the above description of step S120, which will not be repeated here.
  • Step S230 Determine the predicted position area of the moving object in the sampling event frame according to the event stream information corresponding to the sampling event frame.
  • step S230 For the specific description of step S230, reference may be made to the above description of step S130, which will not be repeated here.
  • step S240 For a specific description of step S240, reference may be made to the above description of step S140, which will not be repeated here.
  • The image classification model is a classification model pre-trained on sample images. Its function is to extract image features from the image data of the input positioning area and obtain a feature vector, and then to output a corresponding image classification probability according to the feature vector. The image classification probability represents the probability that the image data of the input positioning area is a positive sample or a negative sample; classification (that is, binary classification) is then performed according to the image classification probability to determine whether a moving object exists in the image data of the input positioning area, thereby realizing the identification and classification of moving objects in the positioning area of the image information.
  • The image features can include color features, texture features, shape features, and spatial relationship features of the image. Color features describe the surface properties of the scene corresponding to the image or image area and are pixel-based features; texture features describe the surface properties of the scene corresponding to the image or image area and require statistical calculation over an area containing multiple pixels; shape features describe the contour of the outer boundary of an object as well as the overall regional characteristics; spatial relationship features describe the mutual spatial positions or relative directional relationships between multiple targets segmented from the video image, such as connection relationships, overlapping relationships, and inclusion relationships.
  • the types of the extracted image features are not specifically limited.
  • the method before identifying and classifying the positioning area in the image information according to the pre-trained image classification model, the method further includes: judging whether the number of pixels in the positioning area is greater than a preset detection threshold . Identify and classify the positioning area in the image information according to the pre-trained image classification model, including: if the number of pixels in the positioning area is greater than the preset detection threshold, then according to the pre-trained image classification model, the positioning area Regions are identified and classified.
  • the number of pixels in the location area is less than or equal to the preset detection threshold, no further processing is performed on the location area.
  • In some embodiments of the present disclosure, in order to avoid falsely detecting a small interfering object (for example, a flying insect) as the target moving object to be monitored (for example, in the monitoring of objects thrown or falling from a height, the falling object is the target moving object to be monitored), the preset detection threshold can be set to a larger value, which effectively prevents false detections caused by interfering objects. In some embodiments of the present disclosure, in order to improve the accuracy of detecting moving objects in the image information, the preset detection threshold can also be set to a small value, for example 0, so that as soon as there are changed pixels in the positioning area, the corresponding positioning area is identified and classified by the image classification model.
  • Because the picture captured by the target camera assembly remains static most of the time, setting the preset detection threshold means that image feature extraction by the image classification model is performed only for positioning areas in which the number of pixels whose light intensity has changed exceeds the threshold, and only the positioning area of the image information needs to be processed. This effectively improves the efficiency of identifying and analyzing moving objects, saves computing resources, reduces the computational load, and improves computational efficiency.
  • the method before identifying and classifying the positioning area in the image information according to the pre-trained image classification model, the method further includes: acquiring a sample image set, and using the sample image set to classify the image classification model Perform image classification training to obtain a pre-trained image classification model;
  • The image classification model is constructed on the basis of a neural network; it is a mathematical model built on a neural network (NN) that, on top of a pre-established network structure, processes information effectively by adjusting the connection relationships among a large number of internal nodes.
  • the image classification model is trained by the sample image set composed of positive sample images and negative sample images, so that the trained image classification model has the ability to output the corresponding image classification probability according to the image data of the input positioning area, and then output The category judgment result of the image data of the input positioning area.
  • According to the technical solution of this embodiment, after the event stream information is acquired through the dynamic vision sensor, the predicted position area of the moving object in the sampling event frame is determined, the matching positioning area is determined in the image information of the target camera component, and image recognition and classification are performed on the image of the positioning area to determine whether a moving object exists in the image. While the positioning of the moving object is realized, the accuracy of detecting moving objects in the image information is improved, which helps to reduce false detections of moving objects.
  • FIG. 3 is a structural block diagram of an apparatus for positioning a moving object provided by an embodiment of the present disclosure.
  • the apparatus specifically includes: an information acquisition module 310 , a sampling execution module 320 , and a classification execution module 330 .
  • the information acquisition module 310 is configured to acquire event stream information through a dynamic vision sensor, and acquire image information through a target camera assembly.
  • the sampling execution module 320 is configured to sample the event stream information according to the preset sampling period to obtain the sampled event frame; and determine the predicted location area of the moving object in the sampled event frame according to the event stream information corresponding to the sampled event frame.
  • the classification execution module 330 is configured to determine, according to the predicted location area, a positioning area in the image information that matches the predicted location area.
  • With the device for positioning a moving object provided by the embodiments of the present disclosure, after the event stream information is acquired through the dynamic vision sensor, the predicted position area of the moving object in the sampling event frame is determined and the matching positioning area is determined in the image information of the target camera component, which improves the positioning efficiency for moving objects, and in particular the real-time performance of detecting high-speed moving objects.
  • In some embodiments of the present disclosure, the sampling execution module 320 is configured to determine the contour area of the moving object in the sampling event frame according to the event stream information corresponding to the sampling event frame, and to mark the contour area with a region-of-interest frame to obtain the predicted position area of the moving object.
  • the sampling execution module 320 may further include: a frame processing unit, a prediction area acquisition unit, and a contour area acquisition unit.
  • the frame processing unit is used for acquiring the event appearing frame and the event disappearing frame according to the event stream information corresponding to the sampled event frame.
  • the predicted area obtaining unit is used for determining the predicted appearing area of the moving object according to the event appearing frame, and determining the predicted disappearing area of the moving object according to the event disappearing frame.
  • the contour area acquisition unit is used for determining the contour area of the moving object according to the predicted appearance area and the predicted disappearance area.
  • In some embodiments of the present disclosure, the event stream information corresponding to the sampling event frame includes the event information of multiple pixels, and the event information of each pixel includes at least one labeled event.
  • The frame processing unit is configured to: determine, among the event information corresponding to the multiple pixels, the pixels whose labeled events are marked as positive events as event appearance pixels; determine, among the event information corresponding to the multiple pixels, the pixels whose labeled events are marked as negative events as event disappearance pixels; and generate the event appearance frame from all event appearance pixels and the event disappearance frame from all event disappearance pixels.
  • In some embodiments of the present disclosure, the pixel resolutions of the event appearance frame, the event disappearance frame, and the sampling event frame are the same. In the event appearance frame, the pixel values corresponding to all event appearance pixels are set to the first pixel value, and the pixel values corresponding to all other pixels are set to the second pixel value; in the event disappearance frame, the pixel values corresponding to all event disappearance pixels are set to the first pixel value, and the pixel values corresponding to all other pixels are set to the second pixel value.
  • In some embodiments of the present disclosure, the prediction area acquisition unit is configured to: determine the predicted appearance area of the moving object according to the positions of all pixels in the event appearance frame whose pixel value is the first pixel value, and determine the predicted disappearance area of the moving object according to the positions of all pixels in the event disappearance frame whose pixel value is the first pixel value.
  • the classification execution module 330 is configured to: obtain a proportional relationship between the resolutions of the dynamic vision sensor and the target camera assembly; perform scaling processing on the predicted location area according to the proportional relationship; The predicted location area is mapped into the image information to determine the matching localization area.
  • the device for positioning a moving object further includes: a classification processing execution module, which is configured to identify and classify the positioning area in the image information according to the pre-trained image classification model, To determine whether there is a moving object in the image information.
  • the apparatus for locating a moving object further includes: a judgment execution module configured to judge whether the number of pixels in the positioning area is greater than a preset detection threshold.
  • the classification execution module 330 is configured to identify the location area in the image information according to the pre-trained image classification model if the number of pixels in the location area is greater than the preset detection threshold and classification processing.
  • In some embodiments of the present disclosure, the apparatus for positioning a moving object further includes a pre-training execution module, which is configured to acquire a sample image set and perform image classification training on the image classification model with the sample image set, so as to obtain the pre-trained image classification model; the image classification model is constructed on the basis of a neural network.
  • the above-mentioned positioning device can execute the positioning method of a moving object provided by any embodiment of the present disclosure, and has corresponding functional modules and beneficial effects for executing the method.
  • FIG. 4 is a structural block diagram of an electronic device provided by an embodiment of the present disclosure.
  • FIG. 4 shows a structural block diagram of an exemplary electronic device 12 suitable for implementing the positioning method described in the embodiment of the present disclosure.
  • the electronic device 12 shown in FIG. 4 is only an example, and should not impose any limitation on the function and scope of use of the embodiments of the present disclosure.
  • the electronic device 12 takes the form of a general-purpose computing device.
  • Components of electronic device 12 may include, but are not limited to, one or more processors or processing units 16 , memory 28 , and a bus 18 connecting various system components including memory 28 and processing unit 16 .
  • Bus 18 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, a graphics acceleration port, a processor, or a local bus using any of a variety of bus structures.
  • these architectures include, but are not limited to, Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MAC) bus, Enhanced ISA bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect ( PCI) bus.
  • Electronic device 12 typically includes a variety of computer system readable media. These media can be any available media that can be accessed by electronic device 12, including both volatile and non-volatile media, removable and non-removable media.
  • Memory 28 may include computer system readable media in the form of volatile memory, such as random access memory (RAM) 30 and/or cache memory 32 .
  • Electronic device 12 may further include other removable/non-removable, volatile/non-volatile computer system storage media.
  • The storage system 34 may be used to read from and write to non-removable, non-volatile magnetic media (not shown in FIG. 4, commonly referred to as a "hard drive").
  • Although not shown in FIG. 4, a disk drive for reading from and writing to a removable non-volatile magnetic disk (for example, a "floppy disk") may be provided, as well as an optical drive for reading from and writing to a removable non-volatile optical disk (for example, a CD-ROM, DVD-ROM, or other optical media).
  • each drive may be connected to bus 18 through one or more data media interfaces.
  • Memory 28 may include at least one program product having a set (eg, at least one) of program modules configured to perform the functions of various embodiments of the present disclosure.
  • a program/utility 40 having a set (at least one) of program modules 42, which may be stored, for example, in memory 28, such program modules 42 including, but not limited to, an operating system, one or more application programs, other program modules, and program data , each or some combination of these examples may include an implementation of a network environment.
  • Program modules 42 generally perform the functions and/or methods of the embodiments described in this disclosure.
  • the electronic device 12 may also communicate with one or more external devices 14 (eg, a keyboard, pointing device, display 24, etc.), with one or more devices that enable a user to interact with the electronic device 12, and/or with Any device (eg, network card, modem, etc.) that enables the electronic device 12 to communicate with one or more other computing devices. Such communication may take place through input/output (I/O) interface 22 . Also, the electronic device 12 may communicate with one or more networks (eg, a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 20 . As shown, network adapter 20 communicates with other modules of electronic device 12 via bus 18 . It should be understood that, although not shown, other hardware and/or software modules may be used in conjunction with electronic device 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives and data backup storage systems.
  • the processing unit 16 executes various functional applications and data processing by running the programs stored in the memory 28, for example, implementing the method for positioning a moving object provided by any embodiment of the present disclosure. That is: obtain event stream information through the dynamic vision sensor, and obtain image information through the target camera component; sample the event stream information according to the preset sampling period to obtain the sampled event frame, and according to the event stream information corresponding to the sampled event frame, Determine the predicted location area of the moving object in the sampling event frame; determine the location area in the image information that matches the predicted location area according to the predicted location area.
  • An embodiment of the present disclosure also provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, the method for locating a moving object according to any embodiment of the present disclosure is implemented. The method includes: acquiring event stream information through the dynamic vision sensor, and acquiring image information through the target camera component; sampling the event stream information according to a preset sampling period to obtain a sampling event frame; determining, according to the event stream information corresponding to the sampling event frame, the predicted position area of the moving object in the sampling event frame; and determining, according to the predicted position area, the positioning area in the image information that matches the predicted position area.
  • the computer storage medium of the embodiments of the present disclosure may adopt any combination of one or more computer-readable media.
  • the computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium.
  • the computer-readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus or device, or any combination of the above.
  • a computer-readable storage medium can be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a propagated data signal in baseband or as part of a carrier wave, with computer-readable program code embodied thereon. Such propagated data signals may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • a computer-readable signal medium can also be any computer-readable medium other than a computer-readable storage medium that can transmit, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device .
  • Program code embodied on a computer readable medium may be transmitted using any suitable medium including, but not limited to, wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
  • Computer program code for carrying out the operations of the present disclosure may be written in one or more programming languages, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server.
  • the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (eg, using an Internet service provider through Internet connection).

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

A method and device for locating a moving object, an electronic device, and a storage medium. The method includes: acquiring event stream information through a dynamic vision sensor, and acquiring image information through a target camera component (S110); sampling the event stream information according to a preset sampling period to obtain a sampling event frame (S120); determining, according to the event stream information corresponding to the sampling event frame, a predicted position area of a moving object in the sampling event frame (S130); and determining, according to the predicted position area, a positioning area in the image information that matches the predicted position area (S140). The method improves the positioning efficiency for moving objects, and in particular the real-time performance of detecting high-speed moving objects.

Description

Method and apparatus for locating a moving object, electronic device, and storage medium
Technical Field
The embodiments of the present disclosure relate to the technical field of image recognition, and in particular to a method and apparatus for locating a moving object, an electronic device, and a storage medium.
Background
With continuous technological progress, image recognition technology has developed rapidly and is widely applied in many fields; locating high-speed moving objects in images has become an important branch of image recognition technology.
In the related art, feature extraction is usually performed directly on the global image of the video captured by a camera assembly, and moving objects are located in the image according to the extracted image features.
However, with such an image recognition approach the computation required for feature extraction is extremely heavy, and because the picture captured by the camera assembly remains static most of the time, computing resources are often wasted. Moreover, judging whether a moving object exists by comparing the position of an object between different image frames on the basis of image features makes it difficult to guarantee real-time recognition, and the positioning performance is particularly poor for objects moving at high speed.
Summary
Embodiments of the present disclosure provide a method and apparatus for locating a moving object, an electronic device, and a storage medium.
In a first aspect, an embodiment of the present disclosure provides a method for locating a moving object. The method includes: acquiring event stream information through a dynamic vision sensor, and acquiring image information through a target camera component; sampling the event stream information according to a preset sampling period to obtain a sampling event frame; determining, according to the event stream information corresponding to the sampling event frame, a predicted position area of a moving object in the sampling event frame; and determining, according to the predicted position area, a positioning area in the image information that matches the predicted position area.
In a second aspect, an embodiment of the present disclosure provides a device for locating a moving object. The device includes: an information acquisition module, configured to acquire event stream information through a dynamic vision sensor and acquire image information through a target camera component; a sampling execution module, configured to sample the event stream information according to a preset sampling period to obtain a sampling event frame, and to determine, according to the event stream information corresponding to the sampling event frame, a predicted position area of a moving object in the sampling event frame; and a classification execution module, configured to determine, according to the predicted position area, a positioning area in the image information that matches the predicted position area.
In a third aspect, an embodiment of the present disclosure provides an electronic device, including: one or more processors; and a memory for storing one or more programs; when the one or more programs are executed by the one or more processors, the one or more processors implement the method for locating a moving object described in any embodiment of the present disclosure.
In a fourth aspect, an embodiment of the present disclosure further provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, the method for locating a moving object described in any embodiment of the present disclosure is implemented.
According to the technical solutions of the method and apparatus for locating a moving object, the electronic device, and the computer-readable storage medium provided by the embodiments of the present disclosure, after the event stream information is acquired through the dynamic vision sensor, the predicted position area of the moving object in the sampling event frame is determined and the matching positioning area is determined in the image information of the target camera component, which improves the positioning efficiency for moving objects, and in particular the real-time performance of detecting high-speed moving objects.
Brief Description of the Drawings
FIG. 1A is a schematic flowchart of a method for locating a moving object provided by an embodiment of the present disclosure;
FIG. 1B is a schematic flowchart of a method for determining the contour area of a moving object in a sampling event frame in an embodiment of the present disclosure;
FIG. 1C is a schematic diagram of the predicted position area of a moving object provided by an embodiment of the present disclosure;
FIG. 1D is a schematic flowchart of a method for determining a positioning area in image information that matches a predicted position area in an embodiment of the present disclosure;
FIG. 2 is a schematic flowchart of another method for locating a moving object provided by an embodiment of the present disclosure;
FIG. 3 is a structural block diagram of a device for locating a moving object provided by an embodiment of the present disclosure;
FIG. 4 is a structural block diagram of an electronic device provided by an embodiment of the present disclosure.
Detailed Description
The present disclosure is further described in detail below with reference to the drawings and embodiments. It can be understood that the specific embodiments described here are only intended to explain the present disclosure rather than to limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the present disclosure rather than the entire structure.
FIG. 1A is a schematic flowchart of a method for locating a moving object provided by an embodiment of the present disclosure. The embodiments of the present disclosure can be used to detect whether a moving object exists in the image information captured by a target camera assembly and to locate, identify, and classify moving objects in that image information. The method can be performed by the device for locating a moving object in the embodiments of the present disclosure; the device can be implemented by software and/or hardware and integrated in an electronic device. The method includes the following steps S110 to S140.
Step S110: Event stream information is acquired through a dynamic vision sensor, and image information is acquired through a target camera component.
A Dynamic Vision Sensor (DVS) is an image acquisition device that adopts an asynchronous pixel mechanism and is based on Address-Event Representation (AER). Compared with the related-art scheme that is based on "frames" acquired at a fixed rate and reads all pixel information of each frame in turn, a DVS does not need to read every pixel in the picture; it only needs to obtain the addresses and information of the pixels whose light intensity changes. Specifically, when the dynamic vision sensor detects that the light intensity change of a pixel is greater than or equal to a preset threshold value, it emits an event signal for that pixel. If the light intensity change is a positive change, that is, the pixel jumps from low brightness to high brightness, an event signal represented by "+1" is emitted and the pixel is labeled as a positive event; if the light intensity change is a negative change, that is, the pixel jumps from high brightness to low brightness, an event signal represented by "-1" is emitted and the pixel is labeled as a negative event. If the light intensity change is smaller than the preset threshold value, no event signal is emitted and the pixel is labeled as no event. The dynamic vision sensor forms the event stream information from the event labels of the individual pixels.
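The per-pixel labeling rule described above can be illustrated with a short sketch. The rule in the disclosure is asynchronous and per pixel; the frame-differencing form below, the threshold value, and the array layout are simplifying assumptions made only for illustration.

```python
import numpy as np

THRESHOLD = 15  # assumed preset threshold for the light-intensity change

def label_events(prev_frame: np.ndarray, curr_frame: np.ndarray) -> np.ndarray:
    """Label each pixel as +1 (positive event), -1 (negative event), or 0 (no event)."""
    diff = curr_frame.astype(np.int32) - prev_frame.astype(np.int32)
    events = np.zeros(diff.shape, dtype=np.int8)
    events[diff >= THRESHOLD] = 1    # jump from low brightness to high brightness
    events[diff <= -THRESHOLD] = -1  # jump from high brightness to low brightness
    return events
```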
The target camera assembly is a shooting device that converts optical image signals into electrical signals and then stores or transmits the electrical signals; it can include various types of shooting devices, for example high-speed image acquisition devices and surveillance cameras. A high-speed image acquisition device is an image acquisition device used for high-speed capture of digitized video image information, and can transmit, display, and store the acquired image data stream along a pre-arranged path. In the embodiments of the present disclosure, the high-speed image acquisition device rapidly captures RGB (red, green, and blue three-channel) images in the visible light range and generates high-speed picture frames to ensure that the trajectory of a high-speed moving object can be captured; the frame rate of the generated picture frames can reach the order of one thousand to one hundred thousand frames per second.
In the embodiments of the present disclosure, in step S110 the event stream information of a target scene is acquired through the dynamic vision sensor, and the image information of the same target scene is acquired through the target camera component. In other words, the event stream information and the image information are captured from the same scene and depict the same content. The event stream information and the image information can be acquired at the same time; alternatively, the event stream information can be acquired first through the dynamic vision sensor, and after positioning has been performed in the sampling event frame, the image information is acquired through the target camera component.
To ensure that the dynamic vision sensor and the target camera assembly capture the same picture content, the two can be placed at adjacent shooting positions (for example, the dynamic vision sensor and the target camera assembly can be integrated in the same electronic device), so that the cameras of the two devices are close enough to reduce the parallax between their shooting angles, and the shooting angles of the two cameras can be adjusted to ensure that pictures of the same scene are obtained.
Step S120: The event stream information is sampled according to a preset sampling period to obtain a sampling event frame.
Step S130: The predicted position area of the moving object in the sampling event frame is determined according to the event stream information corresponding to the sampling event frame.
Compared with a background image whose brightness changes little, the light intensity of the pixels in the area that a moving object passes through changes to different degrees. For example, when a moving object appears, the light intensity of the pixels in the area where it appears increases significantly; when the moving object disappears, the light intensity of the pixels in the area it leaves decreases significantly. Therefore, according to the event stream information it can be determined which pixels in the picture may correspond to a moving object. Specifically, within the preset sampling period, if the event stream information of a pixel includes a positive event or a negative event, that pixel may be related to a moving object. The sampling event frame is the image frame obtained by aggregating, within the preset sampling period, all labeled events of every pixel; the event stream information of the sampling event frame includes the event information corresponding to multiple pixels, and the event information corresponding to each pixel includes at least one labeled event. According to the labeled events (positive events and negative events) of the pixels in the sampling event frame, the position area of the moving object can be predicted.
The preset sampling period can be set according to actual needs. For example, to improve the efficiency of detecting moving objects in the event stream information, the preset sampling period can be set to a smaller value; to reduce the image processing load, the preset sampling period can be set to a larger value. In particular, because the detection accuracy of the DVS is high, event signals at a pixel can be detected at the nanosecond level (for example, every 1000 nanoseconds an event signal of the pixel is acquired), whereas the preset sampling period is usually set at the millisecond level (for example, 10 milliseconds). Therefore, within one sampling period the light intensity of a single pixel may change several times, that is, the DVS may emit multiple event signals for that pixel; as long as the event information of the pixel includes at least one positive event and/or negative event within the preset sampling period, the pixel is included in the predicted position area of the moving object.
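As a sketch of how events emitted at nanosecond granularity might be aggregated into one sampling event frame per (for example) 10 ms period; the event-tuple format (timestamp in ns, x, y, polarity) and the grouping logic are assumptions made for illustration, not a format specified in the disclosure.

```python
from collections import defaultdict

SAMPLING_PERIOD_NS = 10_000_000  # assumed 10 ms sampling period, in nanoseconds

def build_sampling_event_frames(events):
    """Group (t_ns, x, y, polarity) events into per-period sampling event frames.

    Returns a dict mapping a period index to {(x, y): [polarity, ...]}, i.e. the
    labeled events of every pixel that fired during that sampling period.
    """
    frames = defaultdict(lambda: defaultdict(list))
    for t_ns, x, y, polarity in events:
        period = t_ns // SAMPLING_PERIOD_NS
        frames[period][(x, y)].append(polarity)
    return frames

# A pixel belongs to the predicted position area of a moving object as soon as its
# event information for the period contains at least one positive or negative event.
```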
In some embodiments of the present disclosure, step S130 of determining the predicted position area of the moving object in the sampling event frame according to the event stream information corresponding to the sampling event frame may further include: determining, according to the event stream information corresponding to the sampling event frame, the contour area of the moving object in the sampling event frame, and marking the contour area with a region-of-interest frame to obtain the predicted position area of the moving object.
A region of interest (ROI) is an area to be processed that is outlined with a box, circle, ellipse, polygon, or the like. Since the contour of a moving object is usually an irregular shape that is inconvenient to locate in the image, in some embodiments of the present disclosure the region-of-interest frame can be marked in the image as a rectangular frame, namely the smallest rectangle that contains both the appearance contour and the disappearance contour of the moving object; the rectangular frame and the area inside it constitute the predicted position area of the moving object. The contour area of the moving object can be obtained in the sampling event frame through a target detection algorithm, for example a sliding-window detector or R-CNN (Regions with CNN features).
FIG. 1B is a schematic flowchart of a method for determining the contour area of a moving object in a sampling event frame in an embodiment of the present disclosure. In some embodiments of the present disclosure, as shown in FIG. 1B, step S130 of determining the contour area of the moving object in the sampling event frame according to the event stream information corresponding to the sampling event frame may further include steps S131 to S133.
Step S131: An event appearance frame and an event disappearance frame are acquired according to the event stream information corresponding to the sampling event frame.
As described above, the event stream information of the sampling event frame includes the event information corresponding to a plurality of pixels in the sampling event frame; the event information corresponding to each pixel includes at least one labeled event, and a labeled event is labeled as either a positive event or a negative event.
As an optional implementation, step S131 of acquiring the event appearance frame and the event disappearance frame according to the event stream information corresponding to the sampling event frame may further include: determining, among the event information corresponding to the plurality of pixels, the pixels whose labeled events are labeled as positive events as event appearance pixels; determining, among the event information corresponding to the plurality of pixels, the pixels whose labeled events are labeled as negative events as event disappearance pixels; and generating the event appearance frame from all event appearance pixels and the event disappearance frame from all event disappearance pixels.
It can be understood that the sampling event frame describes the event information of all pixels, the event appearance frame describes the pixels corresponding to all positive events, and the event disappearance frame describes the pixels corresponding to all negative events.
The pixel resolutions of the event appearance frame, the event disappearance frame, and the sampling event frame are the same, and the pixel resolution of the sampling event frame is the same as the resolution of the dynamic vision sensor DVS.
In the event appearance frame, the pixel values of all event appearance pixels are set to a first pixel value, and the pixel values of all other pixels are set to a second pixel value. In the event disappearance frame, the pixel values of all event disappearance pixels are set to the first pixel value, and the pixel values of all other pixels are set to the second pixel value. For example, the first pixel value can be set to the maximum pixel value, 255, and the second pixel value can be set to the minimum pixel value, 0.
In some embodiments of the present disclosure, the event appearance frame can be represented by an event appearance matrix, and the event disappearance frame by an event disappearance matrix. Each element of the event appearance matrix corresponds, in position, to a pixel of the event appearance frame, and the value of each element is the pixel value of the corresponding pixel; likewise, each element of the event disappearance matrix corresponds, in position, to a pixel of the event disappearance frame, and the value of each element is the pixel value of the corresponding pixel.
Initially, an empty event appearance matrix and an empty event disappearance matrix can be preset. The value of every element of the empty event appearance matrix is initialized to the second pixel value (for example, 0); the number of element rows corresponds to the number of pixel rows in the pixel resolution of the sampling event frame, the number of element columns corresponds to the number of pixel columns, and each element corresponds to one pixel. The empty event disappearance matrix is initialized in the same way: every element is initialized to the second pixel value (for example, 0), its numbers of rows and columns correspond to the pixel rows and columns of the sampling event frame, and each element corresponds to one pixel. For example, if the resolution of the dynamic vision sensor is 1024 (horizontal pixels) × 648 (vertical pixels), the empty event appearance matrix and the empty event disappearance matrix are both 1024 (rows) × 648 (columns) matrices.
According to the labeled event of each pixel in the sampling event frame within the preset sampling period, each element of the empty event appearance matrix and the empty event disappearance matrix is assigned a value, so as to obtain the event appearance matrix and the event disappearance matrix.
For example, the values of all elements of the empty event appearance matrix and the empty event disappearance matrix are initialized to the second pixel value, for example 0. If, within the preset sampling period, the labeled event acquired for a pixel is a positive event, the element corresponding to that pixel in the empty event appearance matrix is assigned the first pixel value (that is, 255). By assigning the first pixel value (255) to the elements corresponding to all pixels that contain a positive event within the preset sampling period, while the elements corresponding to all pixels that contain only negative events or no events keep the second pixel value (0), the event appearance matrix is obtained. In the resulting event appearance matrix, the positions of the elements whose value is 255 indicate the trajectory along which the edge of the moving object appeared during the preset sampling period; therefore, the highlighted appearance contour of the moving object can be obtained in the image according to the event appearance matrix.
Similarly, if within the preset sampling period the labeled event acquired for a pixel is a negative event, the element corresponding to that pixel in the empty event disappearance matrix is assigned the first pixel value (that is, 255). By assigning the first pixel value (255) to the elements corresponding to all pixels that contain a negative event within the preset sampling period, while the elements corresponding to all pixels that contain only positive events or no events keep the second pixel value (0), the event disappearance matrix is obtained. In the resulting event disappearance matrix, the positions of the elements whose value is 255 indicate the trajectory along which the edge of the moving object disappeared during the preset sampling period; therefore, the highlighted disappearance contour of the moving object can be obtained in the image according to the event disappearance matrix. Finally, the union of the appearance contour and the disappearance contour of the moving object is taken as the contour information of the moving object.
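A minimal sketch of the matrix construction described above, assuming the labeled events of one sampling period are given as (x, y, polarity) tuples; the resolution and pixel values are the example values from the description, and the row/column layout follows the usual image convention (rows = vertical pixels) rather than the 1024 × 648 arrangement quoted in the example.

```python
import numpy as np

WIDTH, HEIGHT = 1024, 648   # example DVS resolution from the description
FIRST_PIXEL_VALUE = 255     # value assigned to event pixels
SECOND_PIXEL_VALUE = 0      # initial value of the empty matrices

def build_event_frames(labeled_events):
    """Build the event appearance and event disappearance matrices for one period."""
    appearance = np.full((HEIGHT, WIDTH), SECOND_PIXEL_VALUE, dtype=np.uint8)
    disappearance = np.full((HEIGHT, WIDTH), SECOND_PIXEL_VALUE, dtype=np.uint8)
    for x, y, polarity in labeled_events:
        if polarity > 0:      # positive event: edge of the moving object appears
            appearance[y, x] = FIRST_PIXEL_VALUE
        elif polarity < 0:    # negative event: edge of the moving object disappears
            disappearance[y, x] = FIRST_PIXEL_VALUE
    return appearance, disappearance
```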
Step S132: The predicted appearance area of the moving object is determined according to the event appearance frame, and the predicted disappearance area of the moving object is determined according to the event disappearance frame.
In step S132, the predicted appearance area of the moving object is determined according to the positions of all pixels in the event appearance frame whose pixel value is the first pixel value; this predicted appearance area is the area where the aforementioned appearance contour of the moving object is located. The predicted disappearance area of the moving object is determined according to the positions of all pixels in the event disappearance frame whose pixel value is the first pixel value; this predicted disappearance area is the area where the aforementioned disappearance contour of the moving object is located. The position of a pixel can be represented by two-dimensional position coordinates.
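One way to turn the set of first-pixel-value positions into such an area is to take their bounding rectangle; the sketch below makes that assumption, while the description itself only requires the area where those pixels are located.

```python
import numpy as np

def bounding_region(event_frame: np.ndarray, first_pixel_value: int = 255):
    """Return [x_min, y_min, x_max, y_max] of all pixels equal to first_pixel_value."""
    ys, xs = np.nonzero(event_frame == first_pixel_value)
    if xs.size == 0:
        return None  # no event pixels in this frame
    return [int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())]
```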
Step S133: The contour area of the moving object is determined according to the predicted appearance area and the predicted disappearance area.
The predicted appearance area of the moving object, i.e. the area where the appearance contour of the moving object is located, can be called the appearance contour area of the moving object; the predicted disappearance area of the moving object, i.e. the area where the disappearance contour of the moving object is located, can be called the disappearance contour area of the moving object. The appearance contour area and the disappearance contour area of the moving object are combined to form the contour area of the moving object.
As described above, after the contour area of the moving object is determined, the contour area is marked with a region-of-interest frame to obtain the predicted position area of the moving object. FIG. 1C is a schematic diagram of the predicted position area of a moving object provided by an embodiment of the present disclosure. As shown in FIG. 1C, a coordinate system is established with the top-left pixel of the sampling event frame as the origin. The region of interest corresponding to the appearance contour area of the moving object can be expressed as ROI_1 = [x11, y11, x12, y12], where (x11, y11) and (x12, y12) are the two-dimensional position coordinates of the top-left vertex A1 (the top-left pixel of the area) and of the bottom-right vertex B1 (the bottom-right pixel of the area) of the appearance contour area, respectively.
The region of interest corresponding to the disappearance contour area of the moving object can be expressed as ROI_2 = [x21, y21, x22, y22], where (x21, y21) and (x22, y22) are the two-dimensional position coordinates of the top-left vertex A2 (the top-left pixel of the area) and of the bottom-right vertex B2 (the bottom-right pixel of the area) of the disappearance contour area, respectively. The predicted position area ROI_DVS of the moving object in the sampling event frame can be expressed by the following formula:
ROI_DVS = [min(x11, x21), min(y11, y21), max(x12, x22), max(y12, y22)].
For example, when x11 < x21, y11 < y21, x12 < x22, and y12 < y22, then ROI_DVS = [x11, y11, x22, y22], that is, (x11, y11) is taken as the position coordinates of the top-left vertex of the predicted position area ROI_DVS and (x22, y22) is taken as the position coordinates of the bottom-right vertex of the predicted position area ROI_DVS, thereby determining the predicted position area ROI_DVS.
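The formula above amounts to taking the smallest rectangle that contains both contour regions; a direct sketch:

```python
def merge_roi(roi_1, roi_2):
    """Combine ROI_1 and ROI_2 ([x_top_left, y_top_left, x_bottom_right, y_bottom_right])
    into the predicted position area ROI_DVS."""
    x11, y11, x12, y12 = roi_1
    x21, y21, x22, y22 = roi_2
    return [min(x11, x21), min(y11, y21), max(x12, x22), max(y12, y22)]

# Example: ROI_1 entirely above and to the left of ROI_2 gives [x11, y11, x22, y22].
print(merge_roi([10, 20, 50, 60], [30, 40, 80, 90]))  # -> [10, 20, 80, 90]
```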
在本公开的一些实施例中,在上述步骤S131中获取事件出现帧(事件出现矩阵)和事件消失帧(事件消失矩阵)之后,在步骤S132之前,确定轮廓区域的方法还可以进一步包括:对事件出现帧和/或事件消失帧,进行噪声点去除处理。
通常情况下,事件出现帧所对应的事件出现矩阵和事件消失帧所对应的事件消失矩阵均为稀疏矩阵,由于动态视觉传感器的敏感性,画面中除了运动物体外的背景区域也会出现稀疏噪声点,因此需要进行稀疏噪声点的去除处理。具体的,对事件出现帧和/或事件消失帧中的非零像素值的像素点进行腐蚀操作和膨胀操作,从而实现噪声点的去除,以便在二值化的事件出现帧和/或事件消失帧上检测运动物体的轮廓区域时,有效改善噪声点造成的影响,提高检测轮廓区域的精确性。
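噪声点去除可以借助形态学腐蚀与膨胀操作实现，下面给出一个基于OpenCV的示意性写法（核大小等参数为本示例的假设，实际可按噪声情况调整）：

```python
import cv2
import numpy as np

def denoise_event_frame(binary_frame, kernel_size=3):
    """对二值化的事件出现帧/事件消失帧先腐蚀再膨胀，去除稀疏噪声点。"""
    kernel = np.ones((kernel_size, kernel_size), np.uint8)
    eroded = cv2.erode(binary_frame, kernel, iterations=1)   # 腐蚀操作：去掉孤立亮点
    dilated = cv2.dilate(eroded, kernel, iterations=1)       # 膨胀操作：恢复轮廓尺度
    return dilated

# 用法示例（appear、disappear 为前述的事件出现帧与事件消失帧）：
# appear_clean = denoise_event_frame(appear)
# disappear_clean = denoise_event_frame(disappear)
```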
S140、根据预测位置区域,确定图像信息中与预测位置区域匹配的定位区域。
在确定了动态视觉传感器的采样事件帧中运动物体的预测位置区域后,如果动态视觉传感器和目标摄像组件的分辨率相同,表示动态视觉传感器采样的采样事件帧和目标摄像组件获取的图像信息的分辨率相同,那么采样事件帧中的预测位置区域与图像信息中的定位区域即为相同区域,在图像信息中获取与采样事件帧具有相同拍摄时刻、相同拍摄位置和拍摄角度的待检测图像,并根据预测位置区域,将待检测图像中的与预测位置区域相同的区域直接作为定位区域;
如果动态视觉传感器和目标摄像组件的分辨率不同,需要先确定动态视觉传感器和目标摄像组件的分辨率之间的比例关系,根据比例关系和采样事件帧中的预测位置区域,确定运动物体在图像信息的待检测图像中的定位区域。
图1D是本公开实施例中一种确定图像信息中与预测位置区域匹配的定位区域的方法的流程示意图,在本公开的一些实施例中,如图1D所示,在步骤S140中,根据预测位置区域,确定图像信息中与预测位置区域匹配的定位区域,可以进一步包括:步骤S141~步骤S143。
步骤S141、获取动态视觉传感器和目标摄像组件的分辨率之间的比例关系。
步骤S142、根据比例关系对预测位置区域进行缩放处理。
步骤S143、将经缩放处理后的预测位置区域映射到图像信息中，以确定出与预测位置区域匹配的定位区域。
在步骤S141中，动态视觉传感器和目标摄像组件的分辨率之间的比例关系包括动态视觉传感器的水平方向分辨率（水平像素）与目标摄像组件的水平方向分辨率（水平像素）的比值，以及动态视觉传感器的垂直方向分辨率（垂直像素）与目标摄像组件的垂直方向分辨率（垂直像素）的比值。示例性的，假设动态视觉传感器的分辨率为1024（水平像素）×648（垂直像素），目标摄像组件的分辨率为1280（水平像素）×960（垂直像素），则动态视觉传感器的水平方向分辨率（水平像素）与目标摄像组件的水平方向分辨率（水平像素）的比值为1024/1280，动态视觉传感器的垂直方向分辨率（垂直像素）与目标摄像组件的垂直方向分辨率（垂直像素）的比值为648/960。
在步骤S142中，将动态视觉传感器的水平方向分辨率与目标摄像组件的水平方向分辨率的比值，作为水平方向调节因子f_x，即
f_x = 动态视觉传感器的水平方向分辨率 / 目标摄像组件的水平方向分辨率（在上述示例中为1024/1280）；
将动态视觉传感器的垂直方向分辨率与目标摄像组件的垂直方向分辨率的比值，作为垂直方向调节因子f_y，即
f_y = 动态视觉传感器的垂直方向分辨率 / 目标摄像组件的垂直方向分辨率（在上述示例中为648/960）。
根据水平方向调节因子f_x和垂直方向调节因子f_y，对预测位置区域ROI_DVS进行水平方向和垂直方向上的缩放处理，得到经缩放处理后的预测位置区域ROI。经缩放处理后的预测位置区域可由如下公式表示：
ROI = [min(x_11, x_21)/f_x, min(y_11, y_21)/f_y, max(x_12, x_22)/f_x, max(y_12, y_22)/f_y]。
在步骤S143中，将经缩放处理后的预测位置区域映射到图像信息中，图像信息中与经缩放处理后的预测位置区域相同的区域即为匹配的定位区域，从而确定出与预测位置区域匹配的定位区域。其中，图像信息中运动物体的定位区域同样可由如下公式表示：
ROI = [min(x_11, x_21)/f_x, min(y_11, y_21)/f_y, max(x_12, x_22)/f_x, max(y_12, y_22)/f_y]。
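按分辨率比例关系对预测位置区域进行缩放映射的过程，可以用如下示意性的Python函数来说明（分辨率取上文示例数值，函数名与取整方式为本示例的假设）：

```python
def map_roi_to_image(roi_dvs, dvs_size=(1024, 648), cam_size=(1280, 960)):
    """将 DVS 坐标系下的预测位置区域缩放映射到目标摄像组件图像的坐标系。"""
    f_x = dvs_size[0] / cam_size[0]   # 水平方向调节因子，例如 1024/1280
    f_y = dvs_size[1] / cam_size[1]   # 垂直方向调节因子，例如 648/960
    x1, y1, x2, y2 = roi_dvs
    # DVS 像素坐标除以对应的调节因子，即换算为图像信息中的像素坐标
    return [int(x1 / f_x), int(y1 / f_y), int(x2 / f_x), int(y2 / f_y)]

# 用法示例：roi_dvs 为采样事件帧中的预测位置区域 [x_min, y_min, x_max, y_max]
roi_image = map_roi_to_image([200, 100, 260, 130])  # 图像信息中匹配的定位区域
```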
根据本公开实施例提供的运动物体的定位方法的技术方案,通过动态视觉传感器获取到事件流信息后,确定采样事件帧中运动物体的预测位置区域,并在目标摄像组件的图像信息中确定匹配的定位区域,提高了运动物体的定位效率,尤其是提高了针对高速运动物体的检测实时性。
图2是本公开实施例所提供的另一种运动物体的定位方法的流程示意图,在本公开的一些实施例中,在确定图像信息中的定位区域之后,即在上述步骤S140之后,根据预训练完成的图像分类模型,对定位区域进行识别及分类处理,以确定图像信息中是否存在运动物体,从而实现对图像信息中的运动物体的识别、分类和跟踪。如图2所示,该定位方法可以包括以下步骤:步骤S210~步骤S250。
S210、通过动态视觉传感器获取事件流信息,以及通过目标摄像组件获取图像信息。
关于步骤S210的具体描述可参见上述对步骤S110的描述,此处不再赘述。
S220、根据预设采样周期对事件流信息进行采样,以获取采样事件帧。
关于步骤S220的具体描述可参见上述对步骤S120的描述,此处不再赘述。
步骤S230、根据采样事件帧对应的事件流信息,确定采样事件帧中运动物体的预测位置区域。
关于步骤S230的具体描述可参见上述对步骤S130的描述,此处不再赘述。
S240、根据预测位置区域,确定图像信息中与预测位置区域匹配的定位区域。
关于步骤S240的具体描述可参见上述对步骤S140的描述,此处不再赘述。
S250、根据预训练完成的图像分类模型,对图像信息中的定位区域进行识别及分类处理,以确定图像信息中是否存在运动物体。
其中,图像分类模型是基于样本图像预先训练完成的分类模型,其作用在于针对输入的定位区域的图像数据,进行图像特征的提取并获取特征向量,然后根据获取到的特征向量输出对应的图像分类概率,其中图像分类概率表示了输入的定位区域的图像数据为正样本或负样本的概率,进而根据该图像分类概率进行分类(即二值分类),确定输入的定位区域的图像数据是否存在运动物体,实现对图像信息的定位区域中运动物体的识别和分类。其中,图像特征可以包括图像的颜色特征、纹理特征、形状特征和空间关系特征;颜色特征描述了图像或图像区域所对应的景物的表面性质,是基于像素点的特征;纹理特征描述了图像或图像区域所对应景物的表面性质,其需要在包含多个像素点的区域中进行统计计算;形状特征则描述物体外边界的轮廓特征,以及整体上的区域特征;空间关系特征是视频图像中分割出来的多个目标之间的相互的空间位置或相对方向关系,例如,连接关系、重叠关系以及包含关系等。在本公开实施例中,对提取的图像特征的类型不作具体限定。
在本公开的一些实施例中,在根据预训练完成的图像分类模型,对图像信息中的定位区域进行识别及分类处理之前,还包括:判断定位区域中像素点的数量是否大于预设检测阈值。根据预训练完成的图像分类模型,对图像信息中的定位区域进行识别及分类处理,包括:若定位区域中像素点的数量大于预设检测阈值,则根据预训练完成的图像分类模型,对定位区域进行识别及分类处理。
在本公开的一些实施例中,若定位区域中像素点的数量小于或等于预设检测阈值,则不对该定位区域作进一步处理。
在本公开的一些实施例中,为了避免将体积较小的干扰物体(例如,飞虫)误检测为待监测的目标运动物体(例如,对于高空抛物的监测,高空抛物为待监测的目标运动物体),可以将预设检测阈值设定为较大数值,以有效防止对干扰物体的误检测。在本公开的一些实施例中,为了提高对图像信息中运动物体的检测精度,也可以将预设检测阈值设定为较小数值,例如,设定为0,即定位区域中存在变化的像素点时,即将对应的定位区域通过图像分类模型进行识别和分类处理。
由于目标摄像组件的拍摄画面在大多数情况下保持静止状态，因此通过预设检测阈值的设定，只有在检测到画面中存在光强度发生变化的像素点数量超过预设检测阈值的定位区域时，才会通过图像分类模型进行图像特征提取计算，且仅需处理图像信息中的定位区域，从而有效提高了对运动物体进行识别、分析的效率，有效节约了计算资源，减轻了计算压力，提高了计算效率。
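作为对定位区域进行阈值筛选并送入图像分类模型的一个示意性流程（其中分类模型、阈值数值与函数名均为本示例的假设），可以参考如下Python草图：

```python
DETECTION_THRESHOLD = 50  # 预设检测阈值（示例值）

def classify_located_region(image, roi, event_pixel_count, classifier):
    """仅当定位区域内发生变化的像素点数量大于预设检测阈值时，才进行识别及分类处理。"""
    if event_pixel_count <= DETECTION_THRESHOLD:
        return None                               # 不对该定位区域作进一步处理
    x1, y1, x2, y2 = roi
    crop = image[y1:y2 + 1, x1:x2 + 1]            # 截取图像信息中的定位区域
    probability = classifier(crop)                # 图像分类模型输出图像分类概率
    return probability > 0.5                      # 二值分类：是否存在运动物体
```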
在本公开的一些实施例中,在根据预训练完成的图像分类模型,对图像信息中的定位区域进行识别及分类处理之前,还包括:获取样本图像集合,并通过样本图像集合对图像分类模型进行图像分类训练,以获取预训练完成的图像分类模型;
其中，图像分类模型基于神经网络构建，即基于神经网络（Neural Networks，NNs）构建的数学模型，在预先建立的网络结构基础上，通过调整内部大量节点的连接关系，实现对信息的有效处理；样本图像集合中，正样本图像为包含运动物体的图像信息，正样本图像的输出值为1；负样本图像为不包含运动物体的图像信息，负样本图像的输出值为0。通过正样本图像和负样本图像组成的样本图像集合对图像分类模型进行模型训练，使得训练完成的图像分类模型具备了根据输入的定位区域的图像数据输出对应的图像分类概率的能力，进而输出对输入的定位区域的图像数据的类别判断结果。
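上述二分类训练思路可以用如下PyTorch风格的草图示意（网络结构、损失函数与样本组织方式均为本示例的假设，仅用于说明正样本输出值为1、负样本输出值为0的训练过程）：

```python
import torch
import torch.nn as nn

class SimpleClassifier(nn.Module):
    """一个极简的二分类网络示例：输入定位区域图像，输出存在运动物体的概率。"""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1))
        self.head = nn.Linear(16, 1)

    def forward(self, x):
        x = self.features(x).flatten(1)
        return torch.sigmoid(self.head(x))

def train_step(model, optimizer, images, labels):
    """labels：正样本图像为 1（包含运动物体），负样本图像为 0（不包含运动物体）。"""
    optimizer.zero_grad()
    probs = model(images).squeeze(1)
    loss = nn.functional.binary_cross_entropy(probs, labels.float())
    loss.backward()
    optimizer.step()
    return loss.item()
```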
根据本公开实施例提供的运动物体的定位方法的技术方案,通过动态视觉传感器获取到事件流信息后,确定采样事件帧中运动物体的预测位置区域,并在目标摄像组件的图像信息中确定匹配的定位区域,进而根据图像分类模型对定位区域图像进行图像识别及分类处理,以确定图像中是否存在运动物体,在实现运动物体定位的同时,提高了图像信息中运动物体的检测精度,有利于减少对运动物体的误检测现象的发生。
图3是本公开实施例提供的一种运动物体的定位装置的结构框图,该装置具体包括:信息获取模块310、采样执行模块320和分类执行模块330。
其中,信息获取模块310,用于通过动态视觉传感器获取事件流信息,以及通过目标摄像组件获取图像信息。采样执行模块320,用于根据预设采样周期对事件流信息进行采样,以获取采样事件帧;根据采样事件帧对应的事件流信息,确定采样事件帧中运动物体的预测位置区域。分类执行模块330,用于根据预测位置区域,确定图像信息中与预测位置区域匹配的定位区域。
根据本公开实施例提供的运动物体的定位装置的技术方案,通过动态视觉传感器获取到事件流信息后,确定采样事件帧中运动物体的预测位置区域,并在目标摄像组件的图像信息中确定匹配的定位区域,提高了运动物体的定位效率,尤其是提高了针对高速运动物体的检测实时性。
在本公开的一些实施例中,采样执行模块320,用于根据采样事件帧对应的事件流信息,确定采样事件帧中运动物体的轮廓区域,并通过感兴趣区域框标注轮廓区域,以获取运动物体的预测位置区域。
在本公开的一些实施例中，采样执行模块320，可以进一步包括：帧处理单元、预测区域获取单元和轮廓区域获取单元。
帧处理单元,用于根据采样事件帧对应的事件流信息,获取事件出现帧和事件消失帧。
预测区域获取单元,用于根据事件出现帧确定运动物体的预测出现区域,根据事件消失帧确定运动物体的预测消失区域。
轮廓区域获取单元,用于根据预测出现区域和预测消失区域,确定运动物体的轮廓区域。
在本公开的一些实施例中，采样事件帧对应的事件流信息包括多个像素点的事件信息，像素点的事件信息包括至少一个标注事件；帧处理单元用于：将多个像素点对应的事件信息中，被标注为正事件的标注事件所对应的像素点确定为事件出现像素点；并且将多个像素点对应的事件信息中，被标注为负事件的标注事件所对应的像素点确定为事件消失像素点；根据所有事件出现像素点生成事件出现帧，并根据所有事件消失像素点生成事件消失帧。
在本公开的一些实施例中,事件出现帧、事件消失帧与采样事件帧的像素分辨率相同;在事件出现帧中,所有事件出现像素点对应的像素值设置为第一像素值,所有非事件出现像素点对应的像素值设置为第二像素值;在事件消失帧中,所有事件消失像素点对应的像素值设置为第一像素值,所有非事件消失像素点对应的像素值设置为第二像素值。
在本公开的一些实施例中,预测区域获取单元用于:根据事件出现帧中像素值为第一像素值的所有像素点的位置,确定运动物体的预测出现区域,根据所述事件消失帧中像素值为第一像素值的所有像素点的位置,确定运动物体的预测消失区域。
在本公开的一些实施例中,分类执行模块330用于:获取动态视觉传感器和目标摄像组件的分辨率之间的比例关系;根据比例关系对预测位置区域进行缩放处理;将经缩放处理后的预测位置区域映射到图像信息中,以确定出匹配的定位区域。
在本公开的一些实施例中,运动物体的定位装置还包括:分类处理执行模块,分类处理执行模块用于根据预训练完成的图像分类模型,对图像信息中的定位区域进行识别及分类处理,以确定图像信息中是否存在运动物体。
在本公开的一些实施例中,运动物体的定位装置还包括:判断执行模块,判断执行模块用于判断定位区域中像素点的数量是否大于预设检测阈值。
在本公开的一些实施例中,分类执行模块330用于若所述定位区域中像素点的数量大于预设检测阈值,则根据预训练完成的图像分类模型,对图像信息中的定位区域进行识别及分类处理。
在本公开的一些实施例中,运动物体的定位装置还包括:预训练执行模块,预训练执行模块用于获取样本图像集合,并通过样本图像集合对图像分类模型进行图像分类训练,以获取预训练完成的图像分类模型;其中,图像分类模型基于神经网络构建。
上述定位装置可执行本公开任意实施例所提供的运动物体的定位方法，具备执行方法相应的功能模块和有益效果，未在本公开实施例中详尽描述的定位装置的技术细节，可参见上述本公开任意实施例提供的定位方法中相关的描述，此处不再赘述。
图4是本公开实施例提供的一种电子设备的结构框图。图4示出了适于用来实现本公开实施例所述的定位方法的示例性电子设备12的结构框图。图4所示的电子设备12仅仅是一个示例,不应对本公开实施例的功能和使用范围带来任何限制。
如图4所示，电子设备12以通用计算设备的形式表现。电子设备12的组件可以包括但不限于：一个或者多个处理器或者处理单元16，存储器28，连接不同系统组件（包括存储器28和处理单元16）的总线18。
总线18表示几类总线结构中的一种或多种，包括存储器总线或者存储器控制器，外围总线，图形加速端口，处理器或者使用多种总线结构中的任意总线结构的局域总线。举例来说，这些体系结构包括但不限于工业标准体系结构（ISA）总线，微通道体系结构（MAC）总线，增强型ISA总线、视频电子标准协会（VESA）局域总线以及外围组件互连（PCI）总线。
电子设备12典型地包括多种计算机系统可读介质。这些介质可以是任何能够被电子设备12访问的可用介质，包括易失性和非易失性介质，可移动的和不可移动的介质。
存储器28可以包括易失性存储器形式的计算机系统可读介质，例如随机存取存储器（RAM）30和/或高速缓存存储器32。电子设备12可以进一步包括其它可移动/不可移动的、易失性/非易失性计算机系统存储介质。仅作为举例，存储系统34可以用于读写不可移动的、非易失性磁介质（图4未显示，通常称为"硬盘驱动器"）。尽管图4中未示出，可以提供用于对可移动非易失性磁盘（例如"软盘"）读写的磁盘驱动器，以及对可移动非易失性光盘（例如CD-ROM，DVD-ROM或者其它光介质）读写的光盘驱动器。在这些情况下，每个驱动器可以通过一个或者多个数据介质接口与总线18相连。存储器28可以包括至少一个程序产品，该程序产品具有一组（例如至少一个）程序模块，这些程序模块被配置以执行本公开各实施例的功能。
具有一组（至少一个）程序模块42的程序/实用工具40，可以存储在例如存储器28中，这样的程序模块42包括但不限于操作系统、一个或者多个应用程序、其它程序模块以及程序数据，这些示例中的每一个或某种组合中可能包括网络环境的实现。程序模块42通常执行本公开所描述的实施例中的功能和/或方法。
电子设备12也可以与一个或多个外部设备14（例如键盘、指向设备、显示器24等）通信，还可与一个或者多个使得用户能与该电子设备12交互的设备通信，和/或与使得该电子设备12能与一个或多个其它计算设备进行通信的任何设备（例如网卡，调制解调器等等）通信。这种通信可以通过输入/输出（I/O）接口22进行。并且，电子设备12还可以通过网络适配器20与一个或者多个网络（例如局域网（LAN），广域网（WAN）和/或公共网络，例如因特网）通信。如图所示，网络适配器20通过总线18与电子设备12的其它模块通信。应当明白，尽管图中未示出，可以结合电子设备12使用其它硬件和/或软件模块，包括但不限于：微代码、设备驱动器、冗余处理单元、外部磁盘驱动阵列、RAID系统、磁带驱动器以及数据备份存储系统等。
处理单元16通过运行存储在存储器28中的程序,从而执行各种功能应用以及数据处理,例如实现本公开任意实施例提供的运动物体的定位方法。也即:通过动态视觉传感器获取事件流信息,以及通过目标摄像组件获取图像信息;根据预设采样周期对事件流信息进行采样,以获取采样事件帧,并根据采样事件帧对应的事件流信息,确定采样事件帧中运动物体的预测位置区域;根据预测位置区域,确定图像信息中与预测位置区域匹配的定位区域。
本公开实施例还提供了一种计算机可读存储介质,其上存储有计算机程序,该程序被处理器执行时实现如本公开任意实施例所述的运动物体的定位方法;该方法包括:通过动态视觉传感器获取事件流信息,以及通过目标摄像组件获取图像信息;根据预设采样周期对事件流信息进行采样,以获取采样事件帧;根据采样事件帧对应的事件流信息,确定采样事件帧中运动物体的预测位置区域;根据预测位置区域,确定图像信息中与预测位置区域匹配的定位区域。
本公开实施例的计算机存储介质，可以采用一个或多个计算机可读的介质的任意组合。计算机可读介质可以是计算机可读信号介质或者计算机可读存储介质。计算机可读存储介质例如可以是但不限于：电、磁、光、电磁、红外线、或半导体的系统、装置或器件，或者任意以上的组合。计算机可读存储介质的更具体的例子（非穷举的列表）包括：具有一个或多个导线的电连接、便携式计算机磁盘、硬盘、随机存取存储器（RAM）、只读存储器（ROM）、可擦式可编程只读存储器（EPROM或闪存）、光纤、便携式紧凑磁盘只读存储器（CD-ROM）、光存储器件、磁存储器件、或者上述的任意合适的组合。在本文件中，计算机可读存储介质可以是任何包含或存储程序的有形介质，该程序可以被指令执行系统、装置或者器件使用或者与其结合使用。
计算机可读的信号介质可以包括在基带中或者作为载波一部分传播的数据信号，其中承载了计算机可读的程序代码。这种传播的数据信号可以采用多种形式，包括但不限于电磁信号、光信号或上述的任意合适的组合。计算机可读的信号介质还可以是计算机可读存储介质以外的任何计算机可读介质，该计算机可读介质可以发送、传播或者传输用于由指令执行系统、装置或者器件使用或者与其结合使用的程序。
计算机可读介质上包含的程序代码可以用任何适当的介质传输,包括但不限于:无线、电线、光缆、RF等等,或者上述的任意合适的组合。
可以以一种或多种程序设计语言或其组合来编写用于执行本公开操作的计算机程序代码,所述程序设计语言包括面向对象的程序设计语言—诸如Java、Smalltalk、C++,还包括常规的过程式程序设计语言—诸如“C”语言或类似的程序设计语言。程序代码可以完全地在用户计算机上执行、部分地在用户计算机上执行、作为一个独立的软件包执行、部分在用户计算机上部分在远程计算机上执行、或者完全在远程计算机或服务器上执行。在涉及远程计算机的情形中,远程计算机可以通过任意种类的网络——包括局域网(LAN)或广域网(WAN)—连接到用户计算机,或者,可以连接到外部计算机(例如利用因特网服务提供商来通过因特网连接)。
注意,上述仅为本公开的较佳实施例及所运用技术原理。本领域技术人员会理解,本公开不限于这里所述的特定实施例,对本领域技术人员来说能够进行各种明显的变化、重新调整和替代而不会脱离本公开的保护范围。因此,虽然通过以上实施例对本公开进行了较为详细的说明,但是本公开不仅仅限于以上实施例,在不脱离本公开构思的情况下,还可以包括更多其他等效实施例,而本公开的范围由所附的权利要求范围决定。

Claims (13)

  1. 一种运动物体的定位方法,其特征在于,所述定位方法包括:
    通过动态视觉传感器获取事件流信息,以及通过目标摄像组件获取图像信息;
    根据预设采样周期对所述事件流信息进行采样,以获取采样事件帧;
    根据所述采样事件帧对应的事件流信息,确定所述采样事件帧中运动物体的预测位置区域;
    根据所述预测位置区域,确定所述图像信息中与所述预测位置区域匹配的定位区域。
  2. 根据权利要求1所述的方法,其特征在于,所述根据所述采样事件帧对应的事件流信息,确定所述采样事件帧中运动物体的预测位置区域,包括:
    根据所述采样事件帧对应的事件流信息,确定所述采样事件帧中运动物体的轮廓区域,并通过感兴趣区域框标注所述轮廓区域,以获取运动物体的预测位置区域。
  3. 根据权利要求2所述的方法,其特征在于,所述根据所述采样事件帧对应的事件流信息,确定所述采样事件帧中运动物体的轮廓区域,包括:
    根据所述采样事件帧对应的事件流信息,获取事件出现帧和事件消失帧;
    根据所述事件出现帧确定运动物体的预测出现区域,根据所述事件消失帧确定运动物体的预测消失区域;
    根据所述预测出现区域和所述预测消失区域,确定运动物体的轮廓区域。
  4. 根据权利要求3所述的方法,其特征在于,所述采样事件帧对应的事件流信息包括多个像素点的事件信息,所述像素点的事件信息包括至少一个标注事件;
    所述根据所述采样事件帧对应的事件流信息,获取事件出现帧和事件消失帧,包括:
    将多个像素点对应的事件信息中,被标注为正事件的标注事件所对应的像素点确定为事件出现像素点;并且
    将多个像素点对应的事件信息中,被标注为负事件的标注事件所对应的像素点确定为事件消失像素点;
    根据所有事件出现像素点生成所述事件出现帧,并根据所有事件消失像素点生成所述事件消失帧。
  5. 根据权利要求4所述的方法,其特征在于,所述事件出现帧、所述事件消失帧与所述采样事件帧的像素分辨率相同;
    在所述事件出现帧中,所有事件出现像素点对应的像素值设置为第一像素值,所有非事件出现像素点对应的像素值设置为第二像素值;
    在所述事件消失帧中,所有事件消失像素点对应的像素值设置为第一像素值,所有非事件消失像素点对应的像素值设置为第二像素值。
  6. 根据权利要求5所述的方法,其特征在于,所述根据所述事件出现帧确定运动物体的预测出现区域,包括:根据所述事件出现帧中像素值为第一像素值的所有像素点的位置,确定运动物体的预测出现区域;
    所述根据所述事件消失帧确定运动物体的预测消失区域,包括:根据所述事件消失帧中像素值为第一像素值的所有像素点的位置,确定运动物体的预测消失区域。
  7. 根据权利要求1所述的方法,其特征在于,所述根据所述预测位置区域,确定所述图像信息中与所述预测位置区域匹配的定位区域,包括:
    获取所述动态视觉传感器和所述目标摄像组件的分辨率之间的比例关系;
    根据所述比例关系对所述预测位置区域进行缩放处理;
    将经缩放处理后的所述预测位置区域映射到所述图像信息中,以确定出匹配的所述定位区域。
  8. 根据权利要求1所述的方法,其特征在于,在所述根据所述预测位置区域,确定所述图像信息中与所述预测位置区域匹配的定位区域之后,所述方法还包括:
    根据预训练完成的图像分类模型,对所述定位区域进行识别及分类处理,以确定所述图像信息中是否存在运动物体。
  9. 根据权利要求8所述的方法,其特征在于,在所述根据预训练完成的图像分类模型,对所述定位区域进行识别及分类处理之前,所述方法还包括:
    判断所述定位区域中像素点的数量是否大于预设检测阈值;
    所述根据预训练完成的图像分类模型,对所述定位区域进行识别及分类处理,包括:
    若所述定位区域中像素点的数量大于预设检测阈值,则根据预训练完成的图像分类模型,对所述定位区域进行识别及分类处理。
  10. 根据权利要求9所述的方法,其特征在于,在所述根据预训练完成的图像分类模型,对所述定位区域进行识别及分类处理之前,所述方法还包括:
    获取样本图像集合,并通过所述样本图像集合对所述图像分类模型进行图像分类训练,以获取预训练完成的所述图像分类模型;其中,所述图像分类模型基于神经网络构建。
  11. 一种运动物体的定位装置,其特征在于,包括:
    信息获取模块,用于通过动态视觉传感器获取事件流信息,以及通过目标摄像组件获取图像信息;
    采样执行模块,用于根据预设采样周期对所述事件流信息进行采样,以获取采样事件帧,并根据所述采样事件帧对应的事件流信息,确定所述采样事件帧中运动物体的预测位置区域;
    分类执行模块,用于根据所述预测位置区域,确定所述图像信息中与所述预测位置区域匹配的定位区域。
  12. 一种电子设备,其特征在于,所述电子设备包括:
    一个或多个处理器;
    存储器,用于存储一个或多个程序;
    当所述一个或多个程序被所述一个或多个处理器执行,使得所述一个或多个处理器实现如权利要求1-10中任一项所述的运动物体的定位方法。
  13. 一种计算机可读存储介质,其上存储有计算机程序,其特征在于,该程序被处理器执行时实现如权利要求1-10中任一项所述的运动物体的定位方法。
PCT/CN2021/140765 2020-12-24 2021-12-23 运动物体的定位方法、装置、电子设备及存储介质 WO2022135511A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011552412.8 2020-12-24
CN202011552412.8A CN112669344B (zh) 2020-12-24 2020-12-24 一种运动物体的定位方法、装置、电子设备及存储介质

Publications (1)

Publication Number Publication Date
WO2022135511A1 true WO2022135511A1 (zh) 2022-06-30

Family

ID=75410041

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/140765 WO2022135511A1 (zh) 2020-12-24 2021-12-23 运动物体的定位方法、装置、电子设备及存储介质

Country Status (2)

Country Link
CN (1) CN112669344B (zh)
WO (1) WO2022135511A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116416602A (zh) * 2023-04-17 2023-07-11 江南大学 基于事件数据与图像数据联合的运动目标检测方法及***

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110909047B (zh) * 2019-11-28 2022-05-17 大连海事大学 一种面向指定时刻的日常行为识别方法
CN112669344B (zh) * 2020-12-24 2024-05-28 北京灵汐科技有限公司 一种运动物体的定位方法、装置、电子设备及存储介质
CN113096158A (zh) * 2021-05-08 2021-07-09 北京灵汐科技有限公司 运动对象的识别方法、装置、电子设备及可读存储介质
WO2023279286A1 (en) * 2021-07-07 2023-01-12 Harman International Industries, Incorporated Method and system for auto-labeling dvs frames
CN113506321A (zh) * 2021-07-15 2021-10-15 清华大学 图像处理方法及装置、电子设备和存储介质
CN114140365B (zh) * 2022-01-27 2022-07-22 荣耀终端有限公司 基于事件帧的特征点匹配方法及电子设备
CN114549442B (zh) * 2022-02-14 2022-09-20 常州市新创智能科技有限公司 一种运动物体的实时监测方法、装置、设备及存储介质
CN114677443B (zh) * 2022-05-27 2022-08-19 深圳智华科技发展有限公司 光学定位方法、装置、设备及存储介质
CN116055844B (zh) * 2023-01-28 2024-05-31 荣耀终端有限公司 一种跟踪对焦方法、电子设备和计算机可读存储介质
CN117975920A (zh) * 2024-03-28 2024-05-03 深圳市戴乐体感科技有限公司 一种鼓槌动态识别定位方法、装置、设备及存储介质


Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105844128B (zh) * 2015-01-15 2021-03-02 北京三星通信技术研究有限公司 身份识别方法和装置

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160210513A1 (en) * 2015-01-15 2016-07-21 Samsung Electronics Co., Ltd. Object recognition method and apparatus
CN107123131A (zh) * 2017-04-10 2017-09-01 安徽清新互联信息科技有限公司 一种基于深度学习的运动目标检测方法
CN110660088A (zh) * 2018-06-30 2020-01-07 华为技术有限公司 一种图像处理的方法和设备
US20200011668A1 (en) * 2018-07-09 2020-01-09 Samsung Electronics Co., Ltd. Simultaneous location and mapping (slam) using dual event cameras
CN111831119A (zh) * 2020-07-10 2020-10-27 Oppo广东移动通信有限公司 眼球追踪方法、装置、存储介质及头戴式显示设备
CN111951313A (zh) * 2020-08-06 2020-11-17 北京灵汐科技有限公司 图像配准方法、装置、设备及介质
CN112669344A (zh) * 2020-12-24 2021-04-16 北京灵汐科技有限公司 一种运动物体的定位方法、装置、电子设备及存储介质

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116416602A (zh) * 2023-04-17 2023-07-11 江南大学 基于事件数据与图像数据联合的运动目标检测方法及***
CN116416602B (zh) * 2023-04-17 2024-05-24 江南大学 基于事件数据与图像数据联合的运动目标检测方法及***

Also Published As

Publication number Publication date
CN112669344B (zh) 2024-05-28
CN112669344A (zh) 2021-04-16

Similar Documents

Publication Publication Date Title
WO2022135511A1 (zh) 运动物体的定位方法、装置、电子设备及存储介质
US11643076B2 (en) Forward collision control method and apparatus, electronic device, program, and medium
US7944454B2 (en) System and method for user monitoring interface of 3-D video streams from multiple cameras
US7787011B2 (en) System and method for analyzing and monitoring 3-D video streams from multiple cameras
US20180151063A1 (en) Real-time detection system for parked vehicles
US10242294B2 (en) Target object classification using three-dimensional geometric filtering
US10685263B2 (en) System and method for object labeling
CN112800860B (zh) 一种事件相机和视觉相机协同的高速抛撒物检测方法和***
WO2022199360A1 (zh) 运动物体的定位方法、装置、电子设备及存储介质
WO2021031954A1 (zh) 对象数量确定方法、装置、存储介质与电子设备
JP7272024B2 (ja) 物体追跡装置、監視システムおよび物体追跡方法
US20210124928A1 (en) Object tracking methods and apparatuses, electronic devices and storage media
Benito-Picazo et al. Deep learning-based video surveillance system managed by low cost hardware and panoramic cameras
US20210342593A1 (en) Method and apparatus for detecting target in video, computing device, and storage medium
TWI726278B (zh) 行車偵測方法、車輛及行車處理裝置
Liu et al. A cloud infrastructure for target detection and tracking using audio and video fusion
CN108229281B (zh) 神经网络的生成方法和人脸检测方法、装置及电子设备
CN115359406A (zh) 一种邮局场景人物交互行为识别方法及***
CN113076889B (zh) 集装箱铅封识别方法、装置、电子设备和存储介质
Zhou et al. A kinematic analysis-based on-line fingerlings counting method using low-frame-rate camera
CN112541403A (zh) 一种利用红外摄像头的室内人员跌倒检测方法
US10916016B2 (en) Image processing apparatus and method and monitoring system
CN113762027B (zh) 一种异常行为的识别方法、装置、设备及存储介质
CN113658223A (zh) 一种基于深度学习的多行人检测与跟踪方法及***
Osborne et al. Temporally stable feature clusters for maritime object tracking in visible and thermal imagery

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21909506

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 02.11.2023)

122 Ep: pct application non-entry in european phase

Ref document number: 21909506

Country of ref document: EP

Kind code of ref document: A1