CN110287826A - Video object detection method based on attention mechanism

Video object detection method based on attention mechanism

Info

Publication number
CN110287826A
CN110287826A
Authority
CN
China
Prior art keywords
detected
frame
feature map
feature
time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910499786.9A
Other languages
Chinese (zh)
Other versions
CN110287826B (en)
Inventor
Li Jianqiang
Bai Jun
Liu Yaqi
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Technology
Original Assignee
Beijing University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Technology filed Critical Beijing University of Technology
Priority to CN201910499786.9A priority Critical patent/CN110287826B/en
Publication of CN110287826A publication Critical patent/CN110287826A/en
Application granted granted Critical
Publication of CN110287826B publication Critical patent/CN110287826B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/41Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/46Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Multimedia (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The present invention relates to a video object detection method based on an attention mechanism, in the field of computer vision. The method comprises the following steps: step S1, extract the candidate feature map of the current time frame; step S2, set a fusion window over the past time period, compute the Laplacian variance of each frame in the window, use the normalized variances as the weights of the frames in the window, weight and sum the candidate feature maps of all frames in the window to obtain a temporal feature, and concatenate the candidate feature of the current time frame with the temporal feature to obtain the feature map to be detected; step S3, extract feature maps at additional scales from the feature map to be detected using convolutional layers; step S4, predict target categories and positions on the feature maps of different scales using convolutional layers. The feature fusion method of the invention assigns different weights to frame features of different quality in the past time period, so that temporal information is fused more fully and the performance of the detection model is improved.

Description

Video object detection method based on attention mechanism
Technical field
The present invention relates to computer vision, deep learning, and video object detection technology.
Background art
Image object detection methods based on deep learning have made enormous progress over the past five years, for example the RCNN family of networks, the SSD network, and the YOLO family of networks. However, in fields such as video surveillance and driver assistance, object detection based on video is in far wider demand. Because video exhibits motion blur, occlusion, diverse shape changes, and diverse illumination changes, good detection results cannot be obtained by detecting targets in video with image object detection techniques alone. In video, adjacent frames are continuous in time and similar in space, and the positions of targets are correlated from frame to frame; how to exploit the temporal information of targets in video has become the key to improving video object detection performance.
Current video object detection frameworks fall mainly into three classes. One class treats video frames as independent images and detects them with image object detection algorithms; such methods ignore temporal information, detect each frame independently, and give unsatisfactory results. Another class combines object detection with object tracking; such methods post-process detection results in order to track targets, and because tracking accuracy depends on detection, errors propagate easily. A third class detects only on a few key frames and generates the features of the remaining frames from optical flow information and key-frame features; although such methods use temporal information, computing optical flow is very expensive, making them hard to use for fast detection.
Summary of the invention
The object of the present invention is to provide a video object detection method that fully fuses temporal features and is both fast and accurate.
To solve the above technical problems, the present invention provides a video object detection method based on an attention mechanism, comprising the following steps:
Step S1: input the video frame image of the current time point into a MobileNet network to extract a candidate feature map.
Step S2: set a temporal feature fusion window in the past time period adjacent to the current time point; for each video frame to be fused in the fusion window, compute the Laplacian variance of its image and, after normalization, use the variances as the fusion weights of the frames to be fused; weight and sum the candidate feature maps of all frames to be fused according to the fusion weights to obtain the temporal feature required by the current frame; concatenate the candidate feature of the current-time video frame with the temporal feature along the channel dimension to obtain a feature map to be detected that has fused the temporal information.
Step S3: on the feature map to be detected, extract feature maps to be detected at additional scales using convolutional feature extraction layers and max-pooling layers.
Step S4: on the feature maps to be detected at different scales, use convolutional layers to predict the target categories and bounding-box coordinates in the current frame.
Further, in step S1, to detect the video frame at the current time point t, the current video frame image I_t ∈ ℝ^{3×H_I×W_I}, where H_I and W_I are the height and width of the video frame, is first input into the MobileNet network for feature extraction, yielding the candidate feature map F_t ∈ ℝ^{C_1×H_1×W_1}, where ℝ denotes the real numbers and C_1, H_1, and W_1 are the number of feature channels, the height, and the width of the candidate feature map.
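As an illustration of step S1, the following is a minimal sketch in PyTorch, assuming torchvision's MobileNetV2 feature extractor as the backbone and an arbitrary 300×300 input size; neither the framework nor the exact MobileNet variant is specified by the patent.

```python
import torch
import torchvision

# Hypothetical backbone: the patent names "Mobilenet" but fixes no framework or
# variant; torchvision's MobileNetV2 feature stack stands in for it here.
backbone = torchvision.models.mobilenet_v2(weights=None).features

frame = torch.randn(1, 3, 300, 300)   # I_t, with H_I = W_I = 300 (assumed size)
with torch.no_grad():
    candidate = backbone(frame)       # F_t, of shape (1, C_1, H_1, W_1)
print(candidate.shape)                # torch.Size([1, 1280, 10, 10])
```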
Further, in step S2, a feature fusion window of width w = s is set in the past time period of the current time point t. Let the video frame images to be fused in the feature fusion window be {I_{t-i}}, i ∈ [1, s], and the corresponding candidate feature maps be {F_{t-i}}, i ∈ [1, s]. Each video frame image I_{t-i} to be fused is converted to a grayscale image G_{t-i}, and the Laplacian variance of the image is computed on the grayscale image. The Laplacian operator at coordinate (x, y) of a grayscale image G is

∇²G(x, y) = G(x+1, y) + G(x-1, y) + G(x, y+1) + G(x, y-1) - 4·G(x, y)

The Laplacian of an image computes the second derivative of each pixel in every direction, capturing regions where pixel values change rapidly, and can be used to detect corners in the image. The Laplacian variance of an image reflects the pixel-value variation of the whole image: a larger Laplacian variance indicates a sharper image, and a smaller one a blurrier image.
First, the Laplacian mean μ_{t-i} of each grayscale image G_{t-i} is computed, where H_I and W_I are the height and width of the grayscale image:

μ_{t-i} = (1 / (H_I·W_I)) · Σ_{x=1..H_I} Σ_{y=1..W_I} ∇²G_{t-i}(x, y)
Next, the Laplacian variance σ²_{t-i} of each grayscale image G_{t-i} is computed:

σ²_{t-i} = (1 / (H_I·W_I)) · Σ_{x=1..H_I} Σ_{y=1..W_I} (∇²G_{t-i}(x, y) - μ_{t-i})²
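A minimal sketch of this sharpness score: the 4-neighbour Laplacian kernel defined above is applied with a convolution and the population variance of the response is taken (the valid convolution drops border pixels, a small deviation from the formula):

```python
import torch
import torch.nn.functional as F

# 4-neighbour discrete Laplacian kernel matching the operator above.
LAPLACIAN = torch.tensor([[0., 1., 0.],
                          [1., -4., 1.],
                          [0., 1., 0.]]).view(1, 1, 3, 3)

def laplacian_variance(gray: torch.Tensor) -> torch.Tensor:
    """gray: (H_I, W_I) grayscale image; returns the scalar variance sigma^2."""
    response = F.conv2d(gray.view(1, 1, *gray.shape), LAPLACIAN)
    return response.var(unbiased=False)   # population variance, as in the formula
```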
If a video frame is sharp, its candidate features help target detection; conversely, some frames are blurred by target motion, and the candidate features of such frames hinder detection. Video frames of different sharpness should therefore be assigned different fusion weights, so that the detection model attends more to sharp features than to blurred ones. First, the fusion weight α_{t-i} of every video frame to be fused is computed:

α_{t-i} = σ²_{t-i} / Σ_{j=1..s} σ²_{t-j}
The candidate features of the frames in the feature fusion window are fused by weighted summation to obtain the temporal feature F̃_t of the current time point:

F̃_t = Σ_{i=1..s} α_{t-i}·F_{t-i}

The temporal feature and the candidate feature of the current frame are concatenated along the channel dimension, completing the fusion of the temporal information and yielding the first feature map to be detected.
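The weight normalization and weighted fusion just described can be sketched as follows, reusing the hypothetical laplacian_variance helper from above; all tensor shapes are assumptions:

```python
import torch

def fuse_window(gray_frames, candidate_maps, current_map):
    """gray_frames:    s grayscale images G_{t-i}, each (H_I, W_I)
       candidate_maps: s candidate feature maps F_{t-i}, each (C_1, H_1, W_1)
       current_map:    F_t of the current frame, (C_1, H_1, W_1)"""
    variances = torch.stack([laplacian_variance(g) for g in gray_frames])
    alphas = variances / variances.sum()              # normalised fusion weights
    temporal = sum(a * f for a, f in zip(alphas, candidate_maps))  # weighted sum
    return torch.cat([current_map, temporal], dim=0)  # concat along channels
```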
Further, in step S3, after the feature map to be detected that has fused the temporal feature at the current time point is obtained, 3×3 convolutional layers and 2×2 pooling layers are used to further extract features from it while reducing its size, in order to obtain feature maps to be detected at more scales. Local information is richer in the large feature maps, which suits the prediction of small targets, while the small feature maps contain stronger global semantic information, which suits the detection of larger targets. After e-1 feature extractions, e feature maps to be detected are finally obtained.
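A sketch of this extra-scale extraction: e-1 blocks of 3×3 convolution followed by 2×2 max pooling; the intermediate channel width is a placeholder, not a value from the patent:

```python
import torch.nn as nn

class ExtraScales(nn.Module):
    """e-1 rounds of 3x3 conv + ReLU + 2x2 max pooling, collecting all e maps."""
    def __init__(self, in_channels: int, e: int = 4, channels: int = 256):
        super().__init__()
        blocks, c = [], in_channels
        for _ in range(e - 1):
            blocks.append(nn.Sequential(
                nn.Conv2d(c, channels, kernel_size=3, padding=1),
                nn.ReLU(inplace=True),
                nn.MaxPool2d(kernel_size=2),
            ))
            c = channels
        self.blocks = nn.ModuleList(blocks)

    def forward(self, x):
        maps = [x]                     # the fused map is the first scale
        for block in self.blocks:
            maps.append(block(maps[-1]))
        return maps                    # e feature maps, from large to small
```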
Further, in step S4, multi-scale feature maps to be detected are obtained through the additional feature extraction. Anchor boxes with prior positions are set on the feature maps to be detected at the different scales, and on each of these feature maps two 3×3 convolutional layers operating over the channel dimension are used to predict, respectively, the offsets of the target bounding boxes relative to the anchor boxes and the categories of the targets. Let the number of categories be d (including background); for each feature map to be detected, the 3×3 convolutional classification prediction layer and the 3×3 convolutional bounding-box prediction layer yield the classification prediction result and the bounding-box prediction result.
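The two prediction layers can be sketched per scale as below; the channel layout (n_i·d class channels and 4·n_i box channels per location) follows the standard SSD-style convention that this step describes:

```python
import torch.nn as nn

class PredictionHead(nn.Module):
    """Per-scale head from step S4: one 3x3 conv for class scores and one for
       bounding-box offsets relative to the anchors."""
    def __init__(self, in_channels: int, num_anchors: int, num_classes: int):
        super().__init__()
        self.cls = nn.Conv2d(in_channels, num_anchors * num_classes, 3, padding=1)
        self.box = nn.Conv2d(in_channels, num_anchors * 4, 3, padding=1)

    def forward(self, x):
        return self.cls(x), self.box(x)   # (n_i*d, H, W) and (4*n_i, H, W) maps
```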
Description of the drawings
Fig. 1 is a schematic diagram of the present invention.
Specific embodiments
In conjunction with the accompanying drawings, the present invention is explained in further detail below. The drawings are simplified schematics that illustrate the basic structure of the invention only in a schematic manner, and therefore show only the components relevant to the invention.
Embodiment 1
As shown in Fig. 1, this example provides a video object detection method based on an attention mechanism, comprising the following steps:
Step S1: input the video frame image of the current time point into a MobileNet network to extract a candidate feature map.
Step S2: set a temporal feature fusion window in the past time period adjacent to the current time point; for each video frame to be fused in the fusion window, compute the Laplacian variance of its image and, after normalization, use the variances as the fusion weights of the frames to be fused; weight and sum the candidate feature maps of all frames to be fused according to the weights to obtain the temporal feature required by the current frame; concatenate the candidate feature of the current-time video frame with the temporal feature along the channel dimension to obtain a feature map to be detected that has fused the temporal information.
Step S3: on the feature map to be detected, extract feature maps to be detected at additional scales using convolutional feature extraction layers and max-pooling layers.
Step S4: on the feature maps to be detected at different scales, use convolutional layers to predict the target categories and bounding-box coordinates in the current frame.
In step S1, to detect the video frame at the current time point t, the current video frame image I_t ∈ ℝ^{3×H_I×W_I}, where H_I and W_I are the height and width of the frame image, is first input into MobileNet for feature extraction, yielding the candidate feature map F_t ∈ ℝ^{C_1×H_1×W_1}, where C_1, H_1, and W_1 are the number of channels, the height, and the width of the candidate feature map.
In step S2, a feature fusion window of width w = s is set in the past time period of the current time point t. Let the length of the past time period be q; the window width is then set by the following rule: if the number of past time steps is greater than s, the fusion window width is set to s; if it is less than s, there are not enough features and the fusion window width is set to the number of past time steps, i.e. w = min(s, q).
Let the video frame images to be fused in the feature fusion window be {I_{t-i}}, i ∈ [1, s], and the corresponding candidate feature maps be {F_{t-i}}, i ∈ [1, s]. Each video frame image I_{t-i} to be fused is converted to a grayscale image G_{t-i}, and the Laplacian variance of the image is computed on the grayscale image. The Laplacian operator at coordinate (x, y) of a grayscale image G is:

∇²G(x, y) = G(x+1, y) + G(x-1, y) + G(x, y+1) + G(x, y-1) - 4·G(x, y)
where G(x, y) is the pixel value of the grayscale image G at coordinate (x, y). The Laplacian of an image computes the second derivative of each pixel in every direction, capturing regions where pixel values change rapidly, and can be used to detect corners in the image. The Laplacian variance of an image reflects the pixel-value variation of the whole image: a larger Laplacian variance indicates a sharper image, and a smaller one a blurrier image.
First, the Laplacian mean μ_{t-i} of each grayscale image G_{t-i} is computed, where H_I and W_I are the height and width of the grayscale image:

μ_{t-i} = (1 / (H_I·W_I)) · Σ_{x=1..H_I} Σ_{y=1..W_I} ∇²G_{t-i}(x, y)
Next, the Laplacian variance σ²_{t-i} of each grayscale image G_{t-i} is computed:

σ²_{t-i} = (1 / (H_I·W_I)) · Σ_{x=1..H_I} Σ_{y=1..W_I} (∇²G_{t-i}(x, y) - μ_{t-i})²
If a video frame is sharp, its candidate features help target detection; conversely, some frames are blurred by target motion, and the candidate features of such frames hinder detection. Video frames of different sharpness should therefore be assigned different fusion weights, with sharper frames receiving larger feature weights, so that the detection model attends more to sharp features than to blurred ones. First, the fusion weight α_{t-i} of every video frame to be fused is computed:

α_{t-i} = σ²_{t-i} / Σ_{j=1..s} σ²_{t-j}
The candidate features of the frames in the feature fusion window are fused by weighted summation to obtain the temporal feature F̃_t of the current time point:

F̃_t = Σ_{i=1..s} α_{t-i}·F_{t-i}
The temporal feature and the candidate feature of the current frame are concatenated along the channel dimension, completing the fusion of the temporal information and yielding the first feature map to be detected.
In step S3, after the feature map to be detected that has fused the temporal feature at the current time point is obtained, convolutional layers and pooling layers are used to further extract features from it while reducing its size, in order to obtain feature maps to be detected at more scales. Local information is richer in the large feature maps, which suits the prediction of small targets, while the small feature maps contain stronger global semantic information, which suits the detection of larger targets. After e-1 feature extractions, e feature maps to be detected are finally obtained.
In step S4, multi-scale feature maps to be detected are obtained through the additional feature extraction. Anchor boxes with prior positions are set on the feature maps to be detected at the different scales, and on each of these feature maps two convolutional layers operating over the channel dimension are used to predict, respectively, the offsets of the target bounding boxes relative to the anchor boxes and the categories of the targets. Let the number of categories be d (including background). For each feature map to be detected, with C_Fi, H_Fi, and W_Fi its number of channels, height, and width, and n_i anchor boxes at each pixel location, the convolutional classification prediction layer and the convolutional bounding-box prediction layer yield a classification prediction result of shape (n_i·d)×H_Fi×W_Fi and a bounding-box prediction result of shape (4·n_i)×H_Fi×W_Fi.
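For orientation, the sketches above can be chained for one time step t as follows; the window size s, input resolution, number of scales e, anchor count, and class count d are all assumed values, and the randomly initialised modules stand in for trained ones:

```python
import torch

s, d = 4, 21                                      # assumed window and class count
frames = [torch.randn(3, 300, 300) for _ in range(s + 1)]   # I_{t-s}, ..., I_t
grays = [f.mean(dim=0) for f in frames[:-1]]      # crude RGB-to-gray conversion

with torch.no_grad():
    feats = [backbone(f.unsqueeze(0))[0] for f in frames]   # F_{t-i} and F_t
    fused = fuse_window(grays, feats[:-1], feats[-1])       # map to be detected
    scales = ExtraScales(fused.shape[0], e=4)(fused.unsqueeze(0))
    heads = [PredictionHead(m.shape[1], num_anchors=6, num_classes=d)
             for m in scales]
    preds = [h(m) for h, m in zip(heads, scales)]  # (class, box) pair per scale
```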

Claims (5)

1. A video object detection method based on an attention mechanism, characterized by comprising the following steps:
Step S1: input the video frame image of the current time point into MobileNet to extract a candidate feature map;
Step S2: set a temporal feature fusion window in the past time period adjacent to the current time point; for each video frame to be fused in the feature fusion window, compute the Laplacian variance of its image and, after normalization, use the variances as the fusion weights of the frames to be fused; weight and sum the candidate feature maps of all frames to be fused according to the weights to obtain the temporal feature required by the current frame; concatenate the candidate feature of the current-time video frame with the temporal feature along the channel dimension to obtain a feature map to be detected that has fused the temporal information;
Step S3: on the feature map to be detected, extract feature maps to be detected at additional scales using convolutional feature extraction layers and max-pooling layers;
Step S4: on the feature maps to be detected at different scales, use convolutional layers to predict the target categories and bounding-box coordinates in the current frame.
2. The video object detection method based on an attention mechanism according to claim 1, characterized in that
in step S1, to detect the video frame at the current time point t, the current video frame image I_t ∈ ℝ^{3×H_I×W_I}, where H_I and W_I are the height and width of the video frame, is first input into the MobileNet network for feature extraction, obtaining the candidate feature map F_t ∈ ℝ^{C_1×H_1×W_1}, where ℝ denotes the real numbers and C_1, H_1, and W_1 are the number of feature channels, the height, and the width of the candidate feature map.
3. The video object detection method based on an attention mechanism according to claim 2, characterized in that
in step S2, a feature fusion window of width w = s is set in the past time period of the current time point t; let the video frame images to be fused in the feature fusion window be {I_{t-i}}, i ∈ [1, s], and the corresponding candidate feature maps be {F_{t-i}}, i ∈ [1, s]; each video frame image I_{t-i} to be fused is converted to a grayscale image G_{t-i};
the Laplacian variance σ²_{t-i} of each grayscale image G_{t-i} is computed; the fusion weights α_{t-i} of all video frames to be fused are obtained by normalizing the Laplacian variances; the candidate features of the frames in the feature fusion window are fused by weighted summation to obtain the temporal feature F̃_t of the current time point; the temporal feature and the candidate feature of the current frame are concatenated along the channel dimension, completing the fusion of the temporal information and yielding the first feature map to be detected.
4. The video object detection method based on an attention mechanism according to claim 3, characterized in that
in step S3, after the feature map to be detected that has fused the temporal feature at the current time point is obtained, 3×3 convolutional layers and 2×2 pooling layers are used to further extract features from the feature map to be detected while reducing its size; after e-1 feature extractions, e feature maps to be detected are finally obtained.
5. The video object detection method based on an attention mechanism according to claim 4, characterized in that
in step S4, multi-scale feature maps to be detected are obtained through the additional feature extraction; anchor boxes with prior positions are set on the feature maps to be detected at the different scales, and on these feature maps two 3×3 convolutional layers operating over the channel dimension are used to predict, respectively, the offsets of the target bounding boxes relative to the anchor boxes and the categories of the targets; for each feature map to be detected, the 3×3 convolutional classification prediction layer and the 3×3 convolutional bounding-box prediction layer yield the classification prediction result and the bounding-box prediction result.
CN201910499786.9A 2019-06-11 2019-06-11 Video target detection method based on attention mechanism Active CN110287826B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910499786.9A CN110287826B (en) 2019-06-11 2019-06-11 Video target detection method based on attention mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910499786.9A CN110287826B (en) 2019-06-11 2019-06-11 Video target detection method based on attention mechanism

Publications (2)

Publication Number Publication Date
CN110287826A true CN110287826A (en) 2019-09-27
CN110287826B CN110287826B (en) 2021-09-17

Family

ID=68003699

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910499786.9A Active CN110287826B (en) 2019-06-11 2019-06-11 Video target detection method based on attention mechanism

Country Status (1)

Country Link
CN (1) CN110287826B (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110674886A (en) * 2019-10-08 2020-01-10 中兴飞流信息科技有限公司 Video target detection method fusing multi-level features
CN110751646A (en) * 2019-10-28 2020-02-04 支付宝(杭州)信息技术有限公司 Method and device for identifying damage by using multiple image frames in vehicle video
CN111310609A (en) * 2020-01-22 2020-06-19 西安电子科技大学 Video target detection method based on time sequence information and local feature similarity
CN112016472A (en) * 2020-08-31 2020-12-01 山东大学 Driver attention area prediction method and system based on target dynamic information
CN112434607A (en) * 2020-11-24 2021-03-02 北京奇艺世纪科技有限公司 Feature processing method and device, electronic equipment and computer-readable storage medium
CN112561001A (en) * 2021-02-22 2021-03-26 南京智莲森信息技术有限公司 Video target detection method based on space-time feature deformable convolution fusion
CN112686913A (en) * 2021-01-11 2021-04-20 天津大学 Object boundary detection and object segmentation model based on boundary attention consistency
CN113688801A (en) * 2021-10-22 2021-11-23 南京智谱科技有限公司 Chemical gas leakage detection method and system based on spectrum video
WO2022036567A1 (en) * 2020-08-18 2022-02-24 深圳市大疆创新科技有限公司 Target detection method and device, and vehicle-mounted radar
CN114594770A (en) * 2022-03-04 2022-06-07 深圳市千乘机器人有限公司 Inspection method for inspection robot without stopping
CN115131710A (en) * 2022-07-05 2022-09-30 福州大学 Real-time action detection method based on multi-scale feature fusion attention

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102393958A (en) * 2011-07-16 2012-03-28 西安电子科技大学 Multi-focus image fusion method based on compressive sensing
CN103152513A (en) * 2011-12-06 2013-06-12 瑞昱半导体股份有限公司 Image processing method and relative image processing device
CN103702032A (en) * 2013-12-31 2014-04-02 华为技术有限公司 Image processing method, device and terminal equipment
CN105913404A (en) * 2016-07-01 2016-08-31 湖南源信光电科技有限公司 Low-illumination imaging method based on frame accumulation
US20170127016A1 (en) * 2015-10-29 2017-05-04 Baidu Usa Llc Systems and methods for video paragraph captioning using hierarchical recurrent neural networks
CN107481238A (en) * 2017-09-20 2017-12-15 众安信息技术服务有限公司 Image quality measure method and device
US20180060666A1 (en) * 2016-08-29 2018-03-01 Nec Laboratories America, Inc. Video system using dual stage attention based recurrent neural network for future event prediction
CN108921803A (en) * 2018-06-29 2018-11-30 华中科技大学 A kind of defogging method based on millimeter wave and visual image fusion
CN109104568A (en) * 2018-07-24 2018-12-28 苏州佳世达光电有限公司 The intelligent cleaning driving method and drive system of monitoring camera
CN109684912A (en) * 2018-11-09 2019-04-26 中国科学院计算技术研究所 A kind of video presentation method and system based on information loss function
CN109829398A (en) * 2019-01-16 2019-05-31 北京航空航天大学 A kind of object detection method in video based on Three dimensional convolution network

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102393958A (en) * 2011-07-16 2012-03-28 西安电子科技大学 Multi-focus image fusion method based on compressive sensing
CN103152513A (en) * 2011-12-06 2013-06-12 瑞昱半导体股份有限公司 Image processing method and relative image processing device
CN103702032A (en) * 2013-12-31 2014-04-02 华为技术有限公司 Image processing method, device and terminal equipment
US20170127016A1 (en) * 2015-10-29 2017-05-04 Baidu Usa Llc Systems and methods for video paragraph captioning using hierarchical recurrent neural networks
CN105913404A (en) * 2016-07-01 2016-08-31 湖南源信光电科技有限公司 Low-illumination imaging method based on frame accumulation
US20180060666A1 (en) * 2016-08-29 2018-03-01 Nec Laboratories America, Inc. Video system using dual stage attention based recurrent neural network for future event prediction
CN107481238A (en) * 2017-09-20 2017-12-15 众安信息技术服务有限公司 Image quality measure method and device
CN108921803A (en) * 2018-06-29 2018-11-30 华中科技大学 A kind of defogging method based on millimeter wave and visual image fusion
CN109104568A (en) * 2018-07-24 2018-12-28 苏州佳世达光电有限公司 The intelligent cleaning driving method and drive system of monitoring camera
CN109684912A (en) * 2018-11-09 2019-04-26 中国科学院计算技术研究所 A kind of video presentation method and system based on information loss function
CN109829398A (en) * 2019-01-16 2019-05-31 北京航空航天大学 A kind of object detection method in video based on Three dimensional convolution network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
XIN WANG: "Infrared dim target detection based on visual attention", 《INFRARED PHYSICS & TECHNOLOGY》 *
王昕: "基于提升小波变换的图像清晰度评价算法", 《万方数据知识服务平台》 *

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110674886A (en) * 2019-10-08 2020-01-10 中兴飞流信息科技有限公司 Video target detection method fusing multi-level features
CN110674886B (en) * 2019-10-08 2022-11-25 中兴飞流信息科技有限公司 Video target detection method fusing multi-level features
CN110751646A (en) * 2019-10-28 2020-02-04 支付宝(杭州)信息技术有限公司 Method and device for identifying damage by using multiple image frames in vehicle video
CN111310609A (en) * 2020-01-22 2020-06-19 西安电子科技大学 Video target detection method based on time sequence information and local feature similarity
WO2022036567A1 (en) * 2020-08-18 2022-02-24 深圳市大疆创新科技有限公司 Target detection method and device, and vehicle-mounted radar
CN112016472A (en) * 2020-08-31 2020-12-01 山东大学 Driver attention area prediction method and system based on target dynamic information
CN112016472B (en) * 2020-08-31 2023-08-22 山东大学 Driver attention area prediction method and system based on target dynamic information
CN112434607A (en) * 2020-11-24 2021-03-02 北京奇艺世纪科技有限公司 Feature processing method and device, electronic equipment and computer-readable storage medium
CN112434607B (en) * 2020-11-24 2023-05-26 北京奇艺世纪科技有限公司 Feature processing method, device, electronic equipment and computer readable storage medium
CN112686913A (en) * 2021-01-11 2021-04-20 天津大学 Object boundary detection and object segmentation model based on boundary attention consistency
CN112686913B (en) * 2021-01-11 2022-06-10 天津大学 Object boundary detection and object segmentation model based on boundary attention consistency
CN112561001A (en) * 2021-02-22 2021-03-26 南京智莲森信息技术有限公司 Video target detection method based on space-time feature deformable convolution fusion
CN113688801A (en) * 2021-10-22 2021-11-23 南京智谱科技有限公司 Chemical gas leakage detection method and system based on spectrum video
CN114594770A (en) * 2022-03-04 2022-06-07 深圳市千乘机器人有限公司 Inspection method for inspection robot without stopping
CN114594770B (en) * 2022-03-04 2024-04-26 深圳市千乘机器人有限公司 Inspection method for inspection robot without stopping
CN115131710A (en) * 2022-07-05 2022-09-30 福州大学 Real-time action detection method based on multi-scale feature fusion attention

Also Published As

Publication number Publication date
CN110287826B (en) 2021-09-17

Similar Documents

Publication Publication Date Title
CN110287826A (en) A kind of video object detection method based on attention mechanism
CN108596974B (en) Dynamic scene robot positioning and mapping system and method
Hoogendoorn et al. Extracting microscopic pedestrian characteristics from video data
JP5102410B2 (en) Moving body detection apparatus and moving body detection method
TWI393074B (en) Apparatus and method for moving object detection
CN102741884B (en) Moving body detecting device and moving body detection method
CN110175576A (en) A kind of driving vehicle visible detection method of combination laser point cloud data
CN109460753A (en) A method of detection over-water floats
CN106529419B (en) The object automatic testing method of saliency stacking-type polymerization
CN107886120A (en) Method and apparatus for target detection tracking
CN110033473A (en) Motion target tracking method based on template matching and depth sorting network
CN110415277A (en) Based on light stream and the multi-target tracking method of Kalman filtering, system, device
CN107784291A (en) target detection tracking method and device based on infrared video
CN108596055A (en) The airport target detection method of High spatial resolution remote sensing under a kind of complex background
CN103425967A (en) Pedestrian flow monitoring method based on pedestrian detection and tracking
CN105243356B (en) A kind of method and device that establishing pedestrian detection model and pedestrian detection method
CN110033475A (en) A kind of take photo by plane figure moving object segmentation and removing method that high-resolution texture generates
CN108648211A (en) A kind of small target detecting method, device, equipment and medium based on deep learning
WO2008020598A1 (en) Subject number detecting device and subject number detecting method
CN112836640A (en) Single-camera multi-target pedestrian tracking method
CN109191498A (en) Object detection method and system based on dynamic memory and motion perception
CN106504274A (en) A kind of visual tracking method and system based under infrared camera
CN106204633A (en) A kind of student trace method and apparatus based on computer vision
CN105809716A (en) Superpixel and three-dimensional self-organizing background subtraction algorithm-combined foreground extraction method
CN106156714A (en) The Human bodys' response method merged based on skeletal joint feature and surface character

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant