CN110287826A - Video object detection method based on an attention mechanism - Google Patents
Video object detection method based on an attention mechanism
- Publication number
- CN110287826A (application CN201910499786.9A)
- Authority
- CN
- China
- Prior art keywords
- detected
- frame
- feature map
- feature
- time
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/46—Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Multimedia (AREA)
- Software Systems (AREA)
- General Health & Medical Sciences (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Biophysics (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Biomedical Technology (AREA)
- Mathematical Physics (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Image Analysis (AREA)
Abstract
The present invention relates to a video object detection method based on an attention mechanism, in the field of computer vision. The method comprises the following steps: step S1, extracting a candidate feature map for the video frame at the current time; step S2, setting a fusion window over a past time interval, computing the Laplacian variance of each frame in the window, using the normalized variances as the per-frame weights, computing a weighted sum of the candidate feature maps of all frames in the window to obtain a temporal feature, and concatenating the candidate feature of the current frame with the temporal feature to obtain the detection feature map; step S3, extracting feature maps at additional scales from the detection feature map using convolutional layers; step S4, predicting target categories and positions on the feature maps of different scales using convolutional layers. The feature fusion of the invention assigns different weights to frame features of different quality within the past time interval, so that temporal information is fused more fully and the performance of the detection model is improved.
Description
Technical field
The present invention relates to computer vision, deep learning, and video object detection technology.
Background technique
Image object detection methods based on deep learning have made great progress over the past five years, for example the RCNN family of networks, the SSD network, and the YOLO family of networks. However, in fields such as video surveillance and driver assistance, video-based object detection is in wider demand. Because video suffers from motion blur, occlusion, diverse shape changes, and diverse illumination changes, good detection results cannot be obtained by detecting objects in video with image object detection techniques alone. In video, adjacent frames are continuous in time and similar in space, and object positions are correlated between frames; how to exploit the temporal information of objects in video has become the key to improving video object detection performance.
Current video object detection frameworks fall mainly into three classes. The first treats video frames as independent images and detects them with an image object detection algorithm; such methods ignore temporal information and detect each frame independently, with unsatisfactory results. The second combines object detection with object tracking, post-processing the detection results to track objects; the tracking accuracy depends on the detection, which easily causes error propagation. The third detects only on a few key frames and generates the features of the remaining frames from optical flow and the key-frame features; although such methods exploit temporal information, the computational cost of optical flow is high, making them hard to use for fast detection.
Summary of the invention
The object of the present invention is to provide a video object detection method that fully fuses temporal features and is fast and accurate.
To solve the above technical problem, the present invention provides a video object detection method based on an attention mechanism, comprising the following steps:
Step S1: input the video frame image of the current time point into a Mobilenet network to extract a candidate feature map.
Step S2: set a temporal-feature fusion window over a past time interval adjacent to the current time point; for each frame to be fused in the window, compute the Laplacian variance of its image; after normalization, use the variances as the fusion weights of the frames to be fused; compute a weighted sum of the candidate feature maps of all frames to be fused according to the fusion weights to obtain the temporal feature required by the current frame; concatenate the candidate feature of the current frame with the temporal feature along the channel dimension to obtain a detection feature map that has fused the temporal information.
Step S3: on the detection feature map, extract detection feature maps at additional scales using convolutional feature extraction layers and max pooling layers.
Step S4: on the detection feature maps of different scales, predict the target categories and bounding-box coordinates of the current frame using convolutional layers.
Further, in step S1, to detect the video frame at the current time point t, the frame image I_t is first input into the Mobilenet network for feature extraction, where I_t ∈ R^(3 × H_I × W_I), with H_I and W_I the height and width of the video frame. The extraction yields the candidate feature map F_t ∈ R^(C1 × H1 × W1), where R denotes the real numbers and C1, H1, and W1 are respectively the number of feature channels, the height, and the width of the candidate feature map.
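A minimal sketch of the shape contract of step S1. Average pooling stands in for the real MobileNet backbone; the function name `mobilenet_stub` and all sizes are illustrative assumptions, not the patent's network:

```python
import numpy as np

def mobilenet_stub(image, c1=8, stride=4):
    """Stand-in for the MobileNet backbone: average-pool the image by
    `stride` and tile the result to c1 channels. Only the output shape
    mirrors the patent's F_t in R^(C1 x H1 x W1); real features would
    come from a pretrained network."""
    h, w, _ = image.shape
    h1, w1 = h // stride, w // stride
    # spatial average pooling as a crude feature extractor
    pooled = image[:h1 * stride, :w1 * stride].reshape(
        h1, stride, w1, stride, -1).mean(axis=(1, 3, 4))
    return np.repeat(pooled[None, :, :], c1, axis=0)  # (C1, H1, W1)

frame = np.random.rand(64, 64, 3)   # H_I x W_I x 3 input frame
feat = mobilenet_stub(frame)
print(feat.shape)  # (8, 16, 16)
```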
Further, in step S2, a feature fusion window of width w = s is set over the past time interval of the current time point t. Let the video frame images to be fused in the window be {I_(t−i) | i ∈ [1, s]}, and the corresponding candidate feature maps be {F_(t−i) | i ∈ [1, s]}. Each frame image I_(t−i) to be fused is converted to a grayscale image G_(t−i), and the Laplacian variance of the image is computed on the grayscale image. The Laplacian operator at coordinate (x, y) of a grayscale image G is
∇²G(x, y) = G(x+1, y) + G(x−1, y) + G(x, y+1) + G(x, y−1) − 4·G(x, y).
The Laplacian of an image computes the second derivative of each pixel in each direction, so it captures regions where the pixel value changes sharply and can be used to detect corners in the image. The Laplacian variance of the image then reflects how the pixel values of the whole image vary: a larger Laplacian variance indicates a sharper image, and a smaller one a blurrier image.
First, the Laplacian mean μ_(t−i) of each grayscale image G_(t−i) is computed, with H_I and W_I the height and width of the grayscale image:
μ_(t−i) = (1 / (H_I · W_I)) Σ_(x,y) ∇²G_(t−i)(x, y).
Next, the Laplacian variance V_(t−i) of each grayscale image G_(t−i) is computed:
V_(t−i) = (1 / (H_I · W_I)) Σ_(x,y) (∇²G_(t−i)(x, y) − μ_(t−i))².
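The sharpness score above can be checked with a short sketch in pure NumPy. Evaluating the Laplacian only on interior pixels is an implementation choice of this sketch, not specified by the patent:

```python
import numpy as np

def laplacian_variance(gray):
    """Variance of the 4-neighbour discrete Laplacian, the per-frame
    sharpness score. `gray` is an H x W grayscale array; borders are
    handled by evaluating only interior pixels."""
    lap = (gray[2:, 1:-1] + gray[:-2, 1:-1] +
           gray[1:-1, 2:] + gray[1:-1, :-2] - 4.0 * gray[1:-1, 1:-1])
    return lap.var()

# a crisp square has sharp edges; a linear ramp has zero second derivative
sharp = np.zeros((32, 32)); sharp[8:24, 8:24] = 1.0
blurry = np.linspace(0.0, 1.0, 32)[None, :].repeat(32, axis=0)
assert laplacian_variance(sharp) > laplacian_variance(blurry)
```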
If a video frame is sharp, its candidate features help the detection of targets; conversely, some frames are blurred by target motion, and the candidate features of those frames are unfavorable for detecting targets. Frames of different sharpness should therefore be assigned different fusion weights, so that the detection model attends more to sharp features than to blurry ones. First the fusion weight α_(t−i) of each frame to be fused is computed:
α_(t−i) = V_(t−i) / Σ_(j=1..s) V_(t−j).
The candidate features of the frames in the fusion window are fused by weighted summation to obtain the temporal feature of the current time point:
F_temp = Σ_(i=1..s) α_(t−i) · F_(t−i).
The temporal feature is concatenated with the candidate feature of the current frame along the channel dimension, completing the fusion of temporal information and yielding the first detection feature map.
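A sketch of the fusion in step S2, assuming NumPy arrays in (C, H, W) layout; the function name `fuse_temporal` and the concrete shapes are illustrative:

```python
import numpy as np

def fuse_temporal(candidates, variances, current):
    """Weighted fusion of the window's candidate feature maps.
    `candidates`: list of s arrays (C1, H1, W1); `variances`: their
    Laplacian variances; `current`: F_t. The weights are the variances
    normalized to sum to one, matching alpha_(t-i) in the text."""
    v = np.asarray(variances, dtype=float)
    alpha = v / v.sum()                        # normalized fusion weights
    temporal = sum(a * f for a, f in zip(alpha, candidates))
    # concatenate along the channel dimension -> (2*C1, H1, W1)
    return np.concatenate([current, temporal], axis=0)

cands = [np.full((4, 8, 8), i, dtype=float) for i in (1, 2, 3)]
fused = fuse_temporal(cands, [0.5, 0.3, 0.2], np.zeros((4, 8, 8)))
print(fused.shape)  # (8, 8, 8)
```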
Further, in step S3, after the detection feature map D_1 that has fused the temporal feature is obtained for the current time point, detection feature maps at more scales are obtained: 3 × 3 convolutional layers and 2 × 2 pooling layers further extract features from the detection feature map while reducing its size. In this way, large detection feature maps contain richer local information and are suited to predicting small targets, while small detection feature maps contain stronger global semantic information and are suited to detecting larger targets. After e − 1 feature extractions, e detection feature maps {D_1, D_2, …, D_e} are finally obtained.
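The scale pyramid of step S3 can be sketched as follows. The convolution is omitted and only the 2 × 2 pooling that halves the spatial size is shown, a simplification of the patent's conv + pool stage:

```python
import numpy as np

def max_pool_2x2(feat):
    """2 x 2 max pooling on a (C, H, W) feature map; stands in for the
    3 x 3 conv + 2 x 2 pool stage (the conv weights are omitted)."""
    c, h, w = feat.shape
    return feat[:, :h - h % 2, :w - w % 2].reshape(
        c, h // 2, 2, w // 2, 2).max(axis=(2, 4))

def build_pyramid(d1, e):
    """Return e detection feature maps D_1..D_e of decreasing size."""
    maps = [d1]
    for _ in range(e - 1):
        maps.append(max_pool_2x2(maps[-1]))
    return maps

pyramid = build_pyramid(np.random.rand(8, 32, 32), e=4)
print([m.shape for m in pyramid])  # [(8, 32, 32), (8, 16, 16), (8, 8, 8), (8, 4, 4)]
```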
Further, in step S4, the multi-scale detection feature maps obtained by the additional feature extraction are used for prediction. Anchor boxes with prior positions are set on the detection feature maps of different scales, and two 3 × 3 convolutional layers predict, along the channel dimension of each detection feature map, the offsets of the target bounding boxes relative to the anchor boxes and the target categories, respectively. Let the number of categories be d (including the background). For each detection feature map D_i, a 3 × 3 convolutional category prediction layer and a 3 × 3 convolutional bounding-box prediction layer produce the category prediction result and the bounding-box prediction result.
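The output shapes of the two prediction heads follow from the anchor counts: at each pixel of D_i, each of the n_i anchors gets d class scores and 4 box offsets. A small shape-arithmetic sketch (SSD-style, with illustrative anchor numbers):

```python
import numpy as np

def head_shapes(feature_maps, anchors_per_loc, num_classes):
    """Per-scale output shapes of the class and box heads: at each
    pixel of D_i, n_i anchors each get d class scores and 4 offsets."""
    shapes = []
    for f, n in zip(feature_maps, anchors_per_loc):
        _, h, w = f.shape
        shapes.append(((n * num_classes, h, w), (n * 4, h, w)))
    return shapes

maps = [np.zeros((8, 16, 16)), np.zeros((8, 8, 8))]
out = head_shapes(maps, anchors_per_loc=[4, 6], num_classes=21)
print(out[0])  # ((84, 16, 16), (16, 16, 16))
```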
Brief description of the drawings
Fig. 1 is a schematic diagram of the present invention.
Specific embodiments
The present invention is explained in further detail below in conjunction with the accompanying drawings. The drawings are simplified schematics that illustrate the basic structure of the invention only in outline, and therefore show only the components relevant to the invention.
Embodiment 1
As shown in Fig. 1, this example provides a video object detection method based on an attention mechanism, comprising the following steps:
Step S1: input the video frame image of the current time point into a Mobilenet network to extract a candidate feature map.
Step S2: set a temporal-feature fusion window over a past time interval adjacent to the current time point; for each frame to be fused in the window, compute the Laplacian variance of its image; after normalization, use the variances as the fusion weights of the frames to be fused; compute a weighted sum of the candidate feature maps of all frames to be fused according to the weights to obtain the temporal feature required by the current frame; concatenate the candidate feature of the current frame with the temporal feature along the channel dimension to obtain a detection feature map that has fused the temporal information.
Step S3: on the detection feature map, extract detection feature maps at additional scales using convolutional feature extraction layers and max pooling layers.
Step S4: on the detection feature maps of different scales, predict the target categories and bounding-box coordinates of the current frame using convolutional layers.
In step S1, to detect the video frame at the current time point t, the frame image I_t is first input into Mobilenet for feature extraction, where I_t ∈ R^(3 × H_I × W_I), with H_I and W_I the height and width of the frame image. The extraction yields the candidate feature map F_t ∈ R^(C1 × H1 × W1), where C1, H1, and W1 are respectively the channel number, height, and width of the candidate feature map.
In step S2, a feature fusion window of width w = s is set over the past time interval of the current time point t. Let the length of the past time interval be q; the window width is then set by the following rule: if the number of past time steps is at least s, the fusion window width is set to s; if it is less than s, there are not enough features, and the fusion window width is set to the number of past time steps, i.e. w = min(s, q).
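The window-width rule above reduces to taking the minimum of s and the available history q; a one-line sketch (the function name is illustrative):

```python
def fusion_window_width(s, q):
    """Window-width rule from the embodiment: use s past frames when
    at least s are available, otherwise fall back to all q of them."""
    return min(s, q)

assert fusion_window_width(5, 10) == 5   # enough history
assert fusion_window_width(5, 3) == 3    # early in the video
```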
Let the video frame images to be fused in the window be {I_(t−i) | i ∈ [1, s]}, and the corresponding candidate feature maps be {F_(t−i) | i ∈ [1, s]}. Each frame image I_(t−i) to be fused is converted to a grayscale image G_(t−i), and the Laplacian variance is computed on the grayscale image. The Laplacian operator at coordinate (x, y) of grayscale image G is:
∇²G(x, y) = G(x+1, y) + G(x−1, y) + G(x, y+1) + G(x, y−1) − 4·G(x, y),
where G(x, y) is the pixel value of G at coordinate (x, y). The Laplacian of an image computes the second derivative of each pixel in each direction, capturing regions where the pixel value changes sharply, and can be used to detect corners in the image. The Laplacian variance of the image then reflects the pixel-value variation of the whole image: a larger Laplacian variance indicates a sharper image, and a smaller one a blurrier image.
First, the Laplacian mean μ_(t−i) of each grayscale image G_(t−i) is computed, with H_I and W_I the height and width of the grayscale image:
μ_(t−i) = (1 / (H_I · W_I)) Σ_(x,y) ∇²G_(t−i)(x, y).
Next, the Laplacian variance V_(t−i) of each grayscale image G_(t−i) is computed:
V_(t−i) = (1 / (H_I · W_I)) Σ_(x,y) (∇²G_(t−i)(x, y) − μ_(t−i))².
If a video frame is sharp, its candidate features help the detection of targets; conversely, some frames are blurred by target motion, and their candidate features are unfavorable for detecting targets. Frames of different sharpness should therefore be assigned different fusion weights, with sharper frames weighted more heavily, so that the detection model attends more to sharp features than to blurry ones. First the fusion weight α_(t−i) of each frame to be fused is computed:
α_(t−i) = V_(t−i) / Σ_(j=1..s) V_(t−j).
The candidate features of the frames in the fusion window are fused by weighted summation to obtain the temporal feature of the current time point:
F_temp = Σ_(i=1..s) α_(t−i) · F_(t−i).
The temporal feature is concatenated with the candidate feature of the current frame along the channel dimension, completing the fusion of temporal information and yielding the first detection feature map D_1.
In step S3, after the detection feature map D_1 that has fused the temporal feature is obtained for the current time point, detection feature maps at more scales are obtained: convolutional layers and pooling layers further extract features from the detection feature map while reducing its size. Large detection feature maps contain richer local information and are suited to predicting small targets; small detection feature maps contain stronger global semantic information and are suited to detecting larger targets. After e − 1 feature extractions, e detection feature maps {D_1, D_2, …, D_e} are finally obtained.
In step S4, the multi-scale detection feature maps obtained by the additional feature extraction are used for prediction. Anchor boxes with prior positions are set on the detection feature maps of different scales, and two convolutional layers predict, along the channel dimension of each detection feature map, the offsets of the target bounding boxes relative to the anchor boxes and the target categories, respectively. Let the number of categories be d (including the background). For each detection feature map D_i ∈ R^(C_Fi × H_Fi × W_Fi), where C_Fi, H_Fi, and W_Fi are respectively its channel number, height, and width, and the number of anchor boxes at each pixel location is n_i, the convolutional category prediction layer and the convolutional bounding-box prediction layer produce the category prediction result P_cls,i ∈ R^((n_i·d) × H_Fi × W_Fi) and the bounding-box prediction result P_box,i ∈ R^((n_i·4) × H_Fi × W_Fi).
Claims (5)
1. A video object detection method based on an attention mechanism, characterized by comprising the following steps:
Step S1: input the video frame image of the current time point into Mobilenet to extract a candidate feature map;
Step S2: set a temporal-feature fusion window over a past time interval adjacent to the current time point; for each frame to be fused in the window, compute the Laplacian variance of its image; after normalization, use the variances as the fusion weights of the frames to be fused; compute a weighted sum of the candidate feature maps of all frames to be fused according to the weights to obtain the temporal feature required by the current frame; concatenate the candidate feature of the current frame with the temporal feature along the channel dimension to obtain a detection feature map that has fused the temporal information;
Step S3: extract detection feature maps at additional scales on the detection feature map using convolutional feature extraction layers and max pooling layers;
Step S4: on the detection feature maps of different scales, predict the target categories and bounding-box coordinates of the current frame using convolutional layers.
2. The video object detection method based on an attention mechanism according to claim 1, characterized in that in step S1, to detect the video frame at the current time point t, the frame image I_t is first input into the Mobilenet network for feature extraction, where I_t ∈ R^(3 × H_I × W_I), with H_I and W_I the height and width of the video frame; the extraction yields the candidate feature map F_t ∈ R^(C1 × H1 × W1), where R denotes the real numbers and C1, H1, and W1 are respectively the number of feature channels, the height, and the width of the candidate feature map.
3. The video object detection method based on an attention mechanism according to claim 2, characterized in that in step S2, a feature fusion window of width w = s is set over the past time interval of the current time point t; the video frame images to be fused in the window are {I_(t−i) | i ∈ [1, s]} and the corresponding candidate feature maps are {F_(t−i) | i ∈ [1, s]}; each frame image I_(t−i) to be fused is converted to a grayscale image G_(t−i); the Laplacian variance V_(t−i) of each grayscale image G_(t−i) is computed; the fusion weights α_(t−i) of all frames to be fused are computed by normalizing the Laplacian variances; the candidate features of the frames in the fusion window are fused by weighted summation to obtain the temporal feature of the current time point; and the temporal feature is concatenated with the candidate feature of the current frame along the channel dimension, completing the fusion of temporal information and yielding the first detection feature map D_1.
4. The video object detection method based on an attention mechanism according to claim 3, characterized in that in step S3, after the detection feature map D_1 that has fused the temporal feature is obtained for the current time point, 3 × 3 convolutional layers and 2 × 2 pooling layers further extract features from the detection feature map while reducing its size; after e − 1 feature extractions, e detection feature maps are finally obtained.
5. The video object detection method based on an attention mechanism according to claim 4, characterized in that in step S4, the multi-scale detection feature maps obtained by the additional feature extraction are used; anchor boxes with prior positions are set on the detection feature maps of different scales, and two 3 × 3 convolutional layers predict, along the channel dimension of each detection feature map, the offsets of the target bounding boxes relative to the anchor boxes and the target categories, respectively; for each detection feature map D_i, the 3 × 3 convolutional category prediction layer and the 3 × 3 convolutional bounding-box prediction layer produce the category prediction result and the bounding-box prediction result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910499786.9A CN110287826B (en) | 2019-06-11 | 2019-06-11 | Video target detection method based on attention mechanism |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910499786.9A CN110287826B (en) | 2019-06-11 | 2019-06-11 | Video target detection method based on attention mechanism |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110287826A true CN110287826A (en) | 2019-09-27 |
CN110287826B CN110287826B (en) | 2021-09-17 |
Family
ID=68003699
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910499786.9A Active CN110287826B (en) | 2019-06-11 | 2019-06-11 | Video target detection method based on attention mechanism |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110287826B (en) |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110674886A (en) * | 2019-10-08 | 2020-01-10 | 中兴飞流信息科技有限公司 | Video target detection method fusing multi-level features |
CN110751646A (en) * | 2019-10-28 | 2020-02-04 | 支付宝(杭州)信息技术有限公司 | Method and device for identifying damage by using multiple image frames in vehicle video |
CN111310609A (en) * | 2020-01-22 | 2020-06-19 | 西安电子科技大学 | Video target detection method based on time sequence information and local feature similarity |
CN112016472A (en) * | 2020-08-31 | 2020-12-01 | 山东大学 | Driver attention area prediction method and system based on target dynamic information |
CN112434607A (en) * | 2020-11-24 | 2021-03-02 | 北京奇艺世纪科技有限公司 | Feature processing method and device, electronic equipment and computer-readable storage medium |
CN112561001A (en) * | 2021-02-22 | 2021-03-26 | 南京智莲森信息技术有限公司 | Video target detection method based on space-time feature deformable convolution fusion |
CN112686913A (en) * | 2021-01-11 | 2021-04-20 | 天津大学 | Object boundary detection and object segmentation model based on boundary attention consistency |
CN113688801A (en) * | 2021-10-22 | 2021-11-23 | 南京智谱科技有限公司 | Chemical gas leakage detection method and system based on spectrum video |
WO2022036567A1 (en) * | 2020-08-18 | 2022-02-24 | 深圳市大疆创新科技有限公司 | Target detection method and device, and vehicle-mounted radar |
CN114594770A (en) * | 2022-03-04 | 2022-06-07 | 深圳市千乘机器人有限公司 | Inspection method for inspection robot without stopping |
CN115131710A (en) * | 2022-07-05 | 2022-09-30 | 福州大学 | Real-time action detection method based on multi-scale feature fusion attention |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102393958A (en) * | 2011-07-16 | 2012-03-28 | 西安电子科技大学 | Multi-focus image fusion method based on compressive sensing |
CN103152513A (en) * | 2011-12-06 | 2013-06-12 | 瑞昱半导体股份有限公司 | Image processing method and relative image processing device |
CN103702032A (en) * | 2013-12-31 | 2014-04-02 | 华为技术有限公司 | Image processing method, device and terminal equipment |
CN105913404A (en) * | 2016-07-01 | 2016-08-31 | 湖南源信光电科技有限公司 | Low-illumination imaging method based on frame accumulation |
US20170127016A1 (en) * | 2015-10-29 | 2017-05-04 | Baidu Usa Llc | Systems and methods for video paragraph captioning using hierarchical recurrent neural networks |
CN107481238A (en) * | 2017-09-20 | 2017-12-15 | 众安信息技术服务有限公司 | Image quality measure method and device |
US20180060666A1 (en) * | 2016-08-29 | 2018-03-01 | Nec Laboratories America, Inc. | Video system using dual stage attention based recurrent neural network for future event prediction |
CN108921803A (en) * | 2018-06-29 | 2018-11-30 | 华中科技大学 | A kind of defogging method based on millimeter wave and visual image fusion |
CN109104568A (en) * | 2018-07-24 | 2018-12-28 | 苏州佳世达光电有限公司 | The intelligent cleaning driving method and drive system of monitoring camera |
CN109684912A (en) * | 2018-11-09 | 2019-04-26 | 中国科学院计算技术研究所 | A kind of video presentation method and system based on information loss function |
CN109829398A (en) * | 2019-01-16 | 2019-05-31 | 北京航空航天大学 | A kind of object detection method in video based on Three dimensional convolution network |
- 2019-06-11: CN application CN201910499786.9A granted as patent CN110287826B (status: Active)
Patent Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102393958A (en) * | 2011-07-16 | 2012-03-28 | 西安电子科技大学 | Multi-focus image fusion method based on compressive sensing |
CN103152513A (en) * | 2011-12-06 | 2013-06-12 | 瑞昱半导体股份有限公司 | Image processing method and relative image processing device |
CN103702032A (en) * | 2013-12-31 | 2014-04-02 | 华为技术有限公司 | Image processing method, device and terminal equipment |
US20170127016A1 (en) * | 2015-10-29 | 2017-05-04 | Baidu Usa Llc | Systems and methods for video paragraph captioning using hierarchical recurrent neural networks |
CN105913404A (en) * | 2016-07-01 | 2016-08-31 | 湖南源信光电科技有限公司 | Low-illumination imaging method based on frame accumulation |
US20180060666A1 (en) * | 2016-08-29 | 2018-03-01 | Nec Laboratories America, Inc. | Video system using dual stage attention based recurrent neural network for future event prediction |
CN107481238A (en) * | 2017-09-20 | 2017-12-15 | 众安信息技术服务有限公司 | Image quality measure method and device |
CN108921803A (en) * | 2018-06-29 | 2018-11-30 | 华中科技大学 | A kind of defogging method based on millimeter wave and visual image fusion |
CN109104568A (en) * | 2018-07-24 | 2018-12-28 | 苏州佳世达光电有限公司 | The intelligent cleaning driving method and drive system of monitoring camera |
CN109684912A (en) * | 2018-11-09 | 2019-04-26 | 中国科学院计算技术研究所 | A kind of video presentation method and system based on information loss function |
CN109829398A (en) * | 2019-01-16 | 2019-05-31 | 北京航空航天大学 | A kind of object detection method in video based on Three dimensional convolution network |
Non-Patent Citations (2)
Title |
---|
XIN WANG: "Infrared dim target detection based on visual attention", 《INFRARED PHYSICS & TECHNOLOGY》 * |
WANG XIN: "Image sharpness evaluation algorithm based on lifting wavelet transform", Wanfang Data Knowledge Service Platform *
Cited By (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110674886A (en) * | 2019-10-08 | 2020-01-10 | 中兴飞流信息科技有限公司 | Video target detection method fusing multi-level features |
CN110674886B (en) * | 2019-10-08 | 2022-11-25 | 中兴飞流信息科技有限公司 | Video target detection method fusing multi-level features |
CN110751646A (en) * | 2019-10-28 | 2020-02-04 | 支付宝(杭州)信息技术有限公司 | Method and device for identifying damage by using multiple image frames in vehicle video |
CN111310609A (en) * | 2020-01-22 | 2020-06-19 | 西安电子科技大学 | Video target detection method based on time sequence information and local feature similarity |
WO2022036567A1 (en) * | 2020-08-18 | 2022-02-24 | 深圳市大疆创新科技有限公司 | Target detection method and device, and vehicle-mounted radar |
CN112016472A (en) * | 2020-08-31 | 2020-12-01 | 山东大学 | Driver attention area prediction method and system based on target dynamic information |
CN112016472B (en) * | 2020-08-31 | 2023-08-22 | 山东大学 | Driver attention area prediction method and system based on target dynamic information |
CN112434607A (en) * | 2020-11-24 | 2021-03-02 | 北京奇艺世纪科技有限公司 | Feature processing method and device, electronic equipment and computer-readable storage medium |
CN112434607B (en) * | 2020-11-24 | 2023-05-26 | 北京奇艺世纪科技有限公司 | Feature processing method, device, electronic equipment and computer readable storage medium |
CN112686913A (en) * | 2021-01-11 | 2021-04-20 | 天津大学 | Object boundary detection and object segmentation model based on boundary attention consistency |
CN112686913B (en) * | 2021-01-11 | 2022-06-10 | 天津大学 | Object boundary detection and object segmentation model based on boundary attention consistency |
CN112561001A (en) * | 2021-02-22 | 2021-03-26 | 南京智莲森信息技术有限公司 | Video target detection method based on space-time feature deformable convolution fusion |
CN113688801A (en) * | 2021-10-22 | 2021-11-23 | 南京智谱科技有限公司 | Chemical gas leakage detection method and system based on spectrum video |
CN114594770A (en) * | 2022-03-04 | 2022-06-07 | 深圳市千乘机器人有限公司 | Inspection method for inspection robot without stopping |
CN114594770B (en) * | 2022-03-04 | 2024-04-26 | 深圳市千乘机器人有限公司 | Inspection method for inspection robot without stopping |
CN115131710A (en) * | 2022-07-05 | 2022-09-30 | 福州大学 | Real-time action detection method based on multi-scale feature fusion attention |
Also Published As
Publication number | Publication date |
---|---|
CN110287826B (en) | 2021-09-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110287826A (en) | A kind of video object detection method based on attention mechanism | |
CN108596974B (en) | Dynamic scene robot positioning and mapping system and method | |
Hoogendoorn et al. | Extracting microscopic pedestrian characteristics from video data | |
JP5102410B2 (en) | Moving body detection apparatus and moving body detection method | |
TWI393074B (en) | Apparatus and method for moving object detection | |
CN102741884B (en) | Moving body detecting device and moving body detection method | |
CN110175576A (en) | A kind of driving vehicle visible detection method of combination laser point cloud data | |
CN109460753A (en) | A method of detection over-water floats | |
CN106529419B (en) | The object automatic testing method of saliency stacking-type polymerization | |
CN107886120A (en) | Method and apparatus for target detection tracking | |
CN110033473A (en) | Motion target tracking method based on template matching and depth sorting network | |
CN110415277A (en) | Based on light stream and the multi-target tracking method of Kalman filtering, system, device | |
CN107784291A (en) | target detection tracking method and device based on infrared video | |
CN108596055A (en) | The airport target detection method of High spatial resolution remote sensing under a kind of complex background | |
CN103425967A (en) | Pedestrian flow monitoring method based on pedestrian detection and tracking | |
CN105243356B (en) | A kind of method and device that establishing pedestrian detection model and pedestrian detection method | |
CN110033475A (en) | A kind of take photo by plane figure moving object segmentation and removing method that high-resolution texture generates | |
CN108648211A (en) | A kind of small target detecting method, device, equipment and medium based on deep learning | |
WO2008020598A1 (en) | Subject number detecting device and subject number detecting method | |
CN112836640A (en) | Single-camera multi-target pedestrian tracking method | |
CN109191498A (en) | Object detection method and system based on dynamic memory and motion perception | |
CN106504274A (en) | A kind of visual tracking method and system based under infrared camera | |
CN106204633A (en) | A kind of student trace method and apparatus based on computer vision | |
CN105809716A (en) | Superpixel and three-dimensional self-organizing background subtraction algorithm-combined foreground extraction method | |
CN106156714A (en) | The Human bodys' response method merged based on skeletal joint feature and surface character |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |