CN112686928A - Moving target visual tracking method based on multi-source information fusion - Google Patents

Moving target visual tracking method based on multi-source information fusion

Info

Publication number
CN112686928A
Authority
CN
China
Prior art keywords
event
frame
domain
information
data set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110015551.5A
Other languages
Chinese (zh)
Other versions
CN112686928B (en)
Inventor
傅应锴
杨鑫
张吉庆
尹宝才
魏小鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dalian University of Technology
Original Assignee
Dalian University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dalian University of Technology
Priority to CN202110015551.5A
Publication of CN112686928A
Application granted
Publication of CN112686928B
Legal status: Active

Landscapes

  • Image Analysis (AREA)

Abstract

The invention belongs to the technical field of computer vision, and provides a moving target visual tracking method based on multi-source information fusion. Aiming at the moving target visual tracking task in scenes with rapid motion and poor illumination, the invention first builds a moving target tracking data set based on an event camera, and at the same time provides a visual target tracking algorithm based on cross-domain attention that accurately tracks the visual target on this data set. The invention combines the complementary advantages of frame images and event data: frame images provide rich texture information, while event data still provide clear object edge information in challenging scenes. By setting weights for the two kinds of domain information in different scenes, the method effectively integrates the advantages of the two sensors to solve the target tracking problem under complex conditions.

Description

Moving target visual tracking method based on multi-source information fusion
Technical Field
The invention belongs to the technical field of computer vision, and particularly relates to a method for visually tracking a moving target by utilizing a frame image and an event stream output by an event camera based on deep learning.
Background
Moving object tracking is an important topic in computer vision: given the size and position of an object in the first frame of a video, the object must be tracked in the remaining frames. Methods based on convolutional neural networks (CNNs) perform excellently in this field. Most of them rely on conventional frame images (RGB or grayscale) to perform tracking, but the performance of frame-based trackers drops sharply under severe conditions (e.g., low light or fast motion). To improve the robustness of trackers in harsh environments, multi-modal methods have gradually been proposed, using sensors such as depth and thermal infrared cameras that provide valuable additional information to improve tracking. However, like conventional frame-based sensors, depth and thermal infrared sensors also have limited frame rates and suffer from motion blur.
An event camera is a bio-inspired sensor that asynchronously measures light intensity changes in a scene and outputs events. It therefore provides very high temporal resolution (up to 1 MHz) with very low power consumption. Since intensity changes are measured on a logarithmic scale, it can operate over a very high dynamic range (140 dB). The event camera triggers "ON" and "OFF" events when the log-scale pixel intensity change rises above or falls below a threshold. The event camera also has limitations, one of which is that it cannot provide texture or color information of the object, whereas a frame image easily captures rich texture and semantic information under normal conditions. Fusing data from the two domains therefore makes it possible to address visual target tracking in challenging scenarios. The related background art in this field is described in detail below.
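The triggering rule can be illustrated with a minimal sketch. The code below is a simplified, frame-to-frame approximation of the asynchronous process (a real event camera compares each pixel against the log intensity at its last event); the function name and the 0.2 contrast threshold are illustrative assumptions, not values from the patent.

```python
import numpy as np

def generate_events(prev_log_intensity, curr_log_intensity, threshold=0.2):
    """Simplified ON/OFF trigger: an event fires at every pixel whose
    log-intensity change crosses the contrast threshold."""
    diff = curr_log_intensity - prev_log_intensity
    ys, xs = np.nonzero(np.abs(diff) >= threshold)
    polarities = np.where(diff[ys, xs] > 0, 1, -1)  # +1 = ON, -1 = OFF
    return list(zip(xs.tolist(), ys.tolist(), polarities.tolist()))
```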
(1) Single domain tracking based on frame images
At present, mainstream single-domain tracking methods on frame images are mainly based on deep learning, including tracking models based on pre-trained deep features, models based on offline-trained features, and models that integrate correlation filtering into a neural network. Existing algorithms extract image features with designed convolution modules or hand-crafted descriptors, relying on cues such as the texture and color of the target in the frame image, and finally obtain observation features from which the target position is output. Although single-domain tracking algorithms have achieved good results on the relevant data sets, for challenging scenes such as low illumination, high dynamic contrast and fast motion, single-domain target tracking based on frame images still struggles to achieve satisfactory results.
(2) Multi-domain tracking
Current mainstream multi-domain tracking algorithms fuse the frame image with depth information or with thermal infrared information. Depth information about the environment can effectively resolve mutual occlusion between targets. Thermal infrared imaging works effectively in adverse conditions such as low illumination, rain and fog, so combining it with the frame image improves tracking accuracy. Because the output data of an event camera is completely different in form from that of the two sensors above, research on fusing the event domain and the frame domain for target tracking is still at an early exploratory stage. Considering that the event camera can provide contour information of a moving object in complex scenes, how to combine this contour cue with the rich texture information in the frame image to improve the accuracy of the target tracking task is a problem worth studying.
(3) Event tracking data set
Some studies have attempted to track with event data because an event camera captures moving objects well; however, since event data are stored in a form very different from conventional frame images, the events are usually superimposed into event frames and annotated on those frames. Hu et al. collected a large event-based tracking dataset by placing an event camera in front of a display screen and recording labeled grayscale frames, but because the screen plays discrete frames, the dataset cannot represent events between those frames. Mitrokhin et al. collected two event-based tracking data sets: the EED data set and the EV-IMO data set. The EED data set contains only two tracking target classes, with 179 grayscale frames (7.8 seconds) and corresponding labels; EV-IMO additionally provides object masks and raises the annotation frequency of the event data to 200 Hz, but contains only three tracking target classes.
Disclosure of Invention
Aiming at the moving target visual tracking task in scenes with rapid motion and poor illumination, the invention first builds a moving target tracking data set based on an event camera, and at the same time provides a visual target tracking algorithm based on cross-domain attention that accurately tracks the visual target on this data set. The invention combines the complementary advantages of frame images and event data: frame images provide rich texture information, while event data still provide clear object edge information in challenging scenes.
The technical scheme of the invention is as follows:
a moving target visual tracking method based on multi-source information fusion comprises the following steps:
(1) building a data set
The data set comprises 108 sequences and 21 object categories in total, covering various complex scenes such as high exposure, motion blur and HDR; the data set provides synchronously captured event data and frame images, and target annotation boxes at up to 240 Hz; according to target category, the data set is divided into animals, vehicles and everyday items (e.g. bottles and boxes); according to scene, the data set is divided into low light, high dynamic range (HDR), fast motion with motion blur and fast motion without motion blur; according to whether the camera is moving and the number of objects, the data set is divided into four scenes: single object motion with a stationary camera, single object motion with a moving camera, multiple object motion with a stationary camera, and multiple object motion with a moving camera;
(2) frame image feature extractor (FFE)
The frame image feature extractor is used for extracting features from the frame images, adopting ResNet18 as the frame image feature extractor, and taking the outputs of Block 4 and Block 5 in ResNet18 as low-level frame features and high-level frame features respectively;
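A minimal PyTorch sketch of such an extractor is given below, assuming that "Block 4" and "Block 5" correspond to layer3 (conv4_x) and layer4 (conv5_x) in the torchvision ResNet18 implementation; the class name is illustrative.

```python
import torch.nn as nn
from torchvision.models import resnet18

class FrameFeatureExtractor(nn.Module):
    """Sketch of the FFE: a ResNet18 backbone whose layer3 ("Block 4") and
    layer4 ("Block 5") outputs serve as low- and high-level frame features."""
    def __init__(self, pretrained=True):
        super().__init__()
        net = resnet18(pretrained=pretrained)  # ImageNet weights, as in the training setup below
        self.stem = nn.Sequential(net.conv1, net.bn1, net.relu, net.maxpool,
                                  net.layer1, net.layer2)
        self.block4 = net.layer3  # low-level frame features
        self.block5 = net.layer4  # high-level frame features

    def forward(self, frame):
        x = self.stem(frame)
        low = self.block4(x)
        high = self.block5(low)
        return low, high
```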
(3) event image feature extraction module (EFE)
The event image feature extraction module is used for extracting features from event images stacked from the event data. The event data are expressed by the following formula:

E = {e_k},  e_k = (x_k, y_k, t_k, p_k)

where (x_k, y_k) are the pixel coordinates of the event, t_k is the time stamp of the event, and p_k = ±1 is the polarity of the event. To feed the captured asynchronous events into the event data feature extraction module, they are first superimposed into event images as follows: (a) the events between two adjacent frame images are aggregated into N three-dimensional groups, discretizing the events; (b) for each group of events, an event image is generated according to the following method:

E_i(x, y) = Σ_{e_k ∈ S_i} p_k · δ(x - x_k, y - y_k)

S_i = { e_k | T_j + (i - 1)·B ≤ t_k < T_j + i·B }

where i denotes the ith branch, δ denotes the Dirac function, T_j is the time stamp corresponding to the jth frame image in the frame domain, and B is the slice size in the time domain, defined as B = (T_{j+1} - T_j)/2;
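A sketch of this stacking step is shown below. It assumes the DAVIS346 resolution of 346x260 pixels and, for simplicity, uses N non-overlapping slices of width (T_{j+1} - T_j)/N, whereas the patent text defines B = (T_{j+1} - T_j)/2; the function name and these choices are illustrative rather than the exact patented procedure.

```python
import numpy as np

def events_to_images(events, t_start, t_end, num_slices=3, height=260, width=346):
    """Accumulate the events recorded between two adjacent frames into
    num_slices event images by summing polarity per pixel in each time slice."""
    slice_size = (t_end - t_start) / num_slices
    images = np.zeros((num_slices, height, width), dtype=np.float32)
    for x, y, t, p in events:          # p is +1 or -1
        if not (t_start <= t < t_end):
            continue
        i = min(int((t - t_start) // slice_size), num_slices - 1)
        images[i, int(y), int(x)] += p
    return images
```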
The event image feature extraction module likewise extracts low-level and high-level features from the event images; meanwhile, in order to effectively aggregate event images from different time slices, the method fuses the feature maps through N learnable parameters w, specifically:

F_E = ⊕_{i=1}^{N} w_i · e_i

where ⊕ denotes pixel-wise addition, e_i is the output of the ith branch, and w_i is the weight of that output, obtained through training of the network;
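A minimal sketch of this learnable aggregation, with one scalar weight per temporal branch, could look as follows; the class name is illustrative.

```python
import torch
import torch.nn as nn

class WeightedBranchFusion(nn.Module):
    """Fuse the N branch feature maps with learnable per-branch weights,
    summed pixel-wise; the weights are learned during training."""
    def __init__(self, num_branches=3):
        super().__init__()
        self.w = nn.Parameter(torch.ones(num_branches))

    def forward(self, branch_feats):  # list of N tensors with identical shape
        return sum(w_i * e_i for w_i, e_i in zip(self.w, branch_feats))
```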
Meanwhile, in order to better extract the effective information of the events, the method provides an edge attention module (EAB), and efficient extraction of frame edge information is completed through an adaptive attention mechanism; the process is expressed as:

a_i = σ(ψ_{1×1}(P(e_i)))

κ_i = (e_i ⊗ a_i) ⊕ e_i

where σ is the Sigmoid activation function, ψ_{1×1} denotes a 1×1 convolution, κ_i refers to the output feature of the edge attention module in the ith branch, ⊗ refers to element-wise multiplication, and ⊕ and P refer to channel-wise addition and adaptive average pooling, respectively;
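The sketch below assembles the operations named above (adaptive average pooling, 1x1 convolution, Sigmoid, element-wise multiplication, channel-wise addition) into one plausible wiring; the exact connection order inside the patented EAB is an assumption here.

```python
import torch.nn as nn

class EdgeAttentionBlock(nn.Module):
    """Channel-attention style sketch: pooling, a 1x1 conv and a Sigmoid produce
    an attention vector that modulates the branch features, followed by a
    residual (channel-wise) addition."""
    def __init__(self, channels):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv2d(channels, channels, kernel_size=1)
        self.sigmoid = nn.Sigmoid()

    def forward(self, e_i):
        attn = self.sigmoid(self.conv(self.pool(e_i)))  # per-channel attention
        return e_i * attn + e_i                         # modulate, then add back
```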
(4) cross-domain information modulation selection module (CDMS)
The cross-domain information modulation selection module is used for fusing information between the event domain and the frame domain, and is designed based on the following observations: 1. a conventional frame camera effectively captures texture and semantic information, while an event camera extracts object edge information well, so the advantages of both can be exploited simultaneously. 2. For a conventional frame camera, imaging quality degrades sharply in low-light scenes and when objects move quickly, whereas the event camera is unaffected by such scenes; in these cases the event information is more reliable than the frame information. 3. In scenes where several objects move simultaneously, objects are hard to distinguish from edge information alone, and texture differences between objects then provide effective assistance. The cross-domain information modulation selection module proposed by the method fuses the features of the two different domains through a cross-domain attention mechanism. Specifically, two kinds of information D_1 and D_2 from different domains are fused through the following process:
D̂_1 = F^{D_1}(D_1),  D̂_2 = F^{D_2}(D_2)

W = σ(ψ_{3×3}([D̂_1, D̂_2]))

D_out = ψ_{5×5}([W ⊗ D̂_1, (1 - W) ⊗ D̂_2])

where D_1 refers to features from the frame domain, D_2 refers to features from the event domain, and the [.] operation refers to channel splicing (concatenation); F^{D_1} represents the feature extraction process of the frame domain, F^{D_2} represents the feature extraction process of the event domain; ψ_{3×3} represents a 3×3 convolution and ψ_{5×5} represents a 5×5 convolution.
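The fusion idea can be sketched as a gated blend of the two domains; the layout below (a reliability gate computed from the concatenated features, followed by a 5x5 fusion convolution) is an illustrative assumption built from the symbols listed above, not the exact patented wiring.

```python
import torch
import torch.nn as nn

class CrossDomainFusion(nn.Module):
    """Concatenate frame-domain and event-domain features, predict a per-pixel
    reliability gate, and blend the two domains before a final fusion conv."""
    def __init__(self, channels):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1),
            nn.Sigmoid(),
        )
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=5, padding=2)

    def forward(self, d1_frame, d2_event):
        g = self.gate(torch.cat([d1_frame, d2_event], dim=1))   # frame-domain reliability
        blended = torch.cat([g * d1_frame, (1 - g) * d2_event], dim=1)
        return self.fuse(blended)
```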
The invention has the beneficial effects that:
(1) event data set in large challenging scenarios
The target tracking task based on deep learning relies on large annotated datasets, but since the output of an event camera is an asynchronous stream, event data are difficult to label. This patent uses the VICON motion capture system to capture the motion of an object and obtain high-frequency bounding-box annotations of the moving target, and in this way creates a single-object tracking data set based on the DAVIS346 event camera. This data set will facilitate subsequent research on event-camera-based tracking algorithms.
(2) Fusion of frame and event fields
Owing to the asynchrony of event data, the method differs from current approaches that fuse features using RGB-D or RGB-T, and explores the fusion of RGB data and event data for the first time. The CDFI module proposed in this patent effectively extracts features from the frame domain and the event domain through a mutual-assistance mechanism and fuses them according to their reliability. By setting weights for the two kinds of domain information in different scenes, the method effectively integrates the advantages of the two sensors to solve the target tracking problem under complex conditions.
Drawings
Fig. 1 is a structural diagram of a cross-domain feature integration module (CDFI) according to the present invention, which includes a gray data feature extraction module (FFE), an event image feature extraction module (EFE), and a cross-domain information modulation selection module (CDMS).
FIG. 2 is a flow diagram of target tracking based on event and frame image fusion.
Detailed Description
The present invention will be described in further detail with reference to specific embodiments, but the present invention is not limited to the specific embodiments.
A visual tracking method for a moving target based on fused event-domain and frame-domain information features comprises the steps of constructing a data set, training a network model, and testing.
(1) Training data set generation
In order to label the moving target for the grayscale frames and the event stream of the event camera, two steps need to be completed: camera coordinate system transformation and transformation of the target anchor point coordinates.
To achieve coordinate system conversion between an event camera and a VICON system, we first determine the event camera matrix K and the distortion coefficient d using a calibration plate. Then, we can obtain the rotation vector r and the translation vector t of DAVIS346 by the following formula,
r, t = S(K, d, p_i, P_i),  i = 1, 2, …, 25    (10)
where S represents the SolvePnP method, p_i is a set of 2D points on a grayscale image from the APS, and P_i is a set of 3D points on the target from the Vicon. To obtain p_i and P_i, we use a T-shaped calibration frame that both the Vicon and the APS can track. The calibration frame carries 5 infrared-emitting points; by placing it at 5 different positions, a total of 25 paired p_i and P_i can be collected. The 3D points captured by the Vicon are converted to 2D point coordinates in the grayscale frame image by the following equation:
s · [x_j  y_j  1]^T = K · (R · [X_j  Y_j  Z_j]^T + t)    (11)

where R is the rotation matrix corresponding to the rotation vector r, s is a scale factor, and [x_j  y_j  1]^T and [X_j  Y_j  Z_j]^T represent the 2D and 3D coordinates of the jth marker on the object, respectively. From the above information, the label bounding box of the target can be obtained by calculating the maximum and minimum values of all 2D points:

x_l = min_j x_j,  y_l = min_j y_j,  x_r = max_j x_j,  y_r = max_j y_j    (12)

where (x_l, y_l) is the coordinate of the top-left corner of the label box and (x_r, y_r) is the coordinate of the bottom-right corner. We can obtain the width w and height h of the bounding box by the following equation:

w = x_r - x_l,  h = y_r - y_l    (13)
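The whole labelling chain (equations 10 to 13) maps onto standard OpenCV calls; the sketch below is illustrative, and the function and argument names are assumptions rather than part of the patent.

```python
import cv2
import numpy as np

def annotate_frame(K, dist, calib_pts_3d, calib_pts_2d, target_pts_3d):
    """Estimate the DAVIS346 pose from the 25 paired calibration points
    (eq. 10), project the target's Vicon markers into the grayscale image
    (eq. 11) and take the extrema as the label box (eqs. 12-13).
    All point arrays are float32, shaped (N, 3) or (N, 2)."""
    _, rvec, tvec = cv2.solvePnP(calib_pts_3d, calib_pts_2d, K, dist)
    pts_2d, _ = cv2.projectPoints(target_pts_3d, rvec, tvec, K, dist)
    pts_2d = pts_2d.reshape(-1, 2)
    x_l, y_l = pts_2d.min(axis=0)
    x_r, y_r = pts_2d.max(axis=0)
    return float(x_l), float(y_l), float(x_r - x_l), float(y_r - y_l)  # x, y, w, h
```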
(2) network training
For the FFE, parameters are initialized with a ResNet18 model pre-trained on the ImageNet dataset. For the EFE, N is set to 3. The batch size of the model is set to 26. To train the network, Adam is used as the optimizer to update the model parameters; the number of iterations is set to 50, the learning-rate decay factor is set to 0.2, and the decay is applied every 15 iterations. The learning rates of the classifier, the bounding-box regressor, and the CDFI are set to 0.001, 0.001, and 0.0001, respectively.
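A minimal sketch of this optimisation setup is given below; `model.classifier`, `model.bbox_regressor` and `model.cdfi` are placeholder attribute names for the three parameter groups, not identifiers from the patent.

```python
from torch.optim import Adam
from torch.optim.lr_scheduler import StepLR

def build_optimizer(model):
    """Adam with per-module learning rates (0.001 / 0.001 / 0.0001) and a
    step decay of 0.2 applied every 15 of the 50 training iterations."""
    param_groups = [
        {"params": model.classifier.parameters(), "lr": 1e-3},
        {"params": model.bbox_regressor.parameters(), "lr": 1e-3},
        {"params": model.cdfi.parameters(), "lr": 1e-4},
    ]
    optimizer = Adam(param_groups)
    scheduler = StepLR(optimizer, step_size=15, gamma=0.2)
    return optimizer, scheduler
```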

Claims (1)

1. A moving target visual tracking method based on multi-source information fusion is characterized by comprising the following steps:
(1) building a data set
The data set comprises 108 sequences and 21 object categories in total, covering various complex scenes such as high exposure, motion blur and HDR; the data set provides synchronously captured event data and frame images, and target annotation boxes at up to 240 Hz; according to target category, the data set is divided into animals, vehicles and everyday goods; according to scene, the data set is divided into low light, high dynamic range, fast motion with motion blur and fast motion without motion blur; according to whether the camera is moving and the number of objects, the data set is divided into four scenes: single object motion with a stationary camera, single object motion with a moving camera, multiple object motion with a stationary camera, and multiple object motion with a moving camera;
(2) frame image feature extractor FFE
The frame image feature extractor is used for extracting features from the frame images, adopting ResNet18 as the frame image feature extractor, and taking the outputs of Block 4 and Block 5 in ResNet18 as low-level frame features and high-level frame features respectively;
(3) event image feature extraction module EFE
The event image feature extraction module is used for extracting features from event images stacked from the event data, the event data being expressed by the following formula:

E = {e_k},  e_k = (x_k, y_k, t_k, p_k)

where (x_k, y_k) are the pixel coordinates of the event, t_k is the time stamp of the event, and p_k = ±1 is the polarity of the event; to feed the captured asynchronous events into the event data feature extraction module, they are first superimposed into event images as follows: (a) the events between two adjacent frame images are aggregated into N three-dimensional groups, discretizing the events; (b) for each group of events, an event image is generated according to the following method:

E_i(x, y) = Σ_{e_k ∈ S_i} p_k · δ(x - x_k, y - y_k)

S_i = { e_k | T_j + (i - 1)·B ≤ t_k < T_j + i·B }

where i denotes the ith branch, δ denotes the Dirac function, T_j is the time stamp corresponding to the jth frame image in the frame domain, and B is the slice size in the time domain, defined as B = (T_{j+1} - T_j)/2;
The event image feature extraction module likewise extracts low-level and high-level features from the event images; meanwhile, in order to effectively aggregate event images from different time slices, the method fuses the feature maps through N learnable parameters w, specifically:

F_E = ⊕_{i=1}^{N} w_i · e_i

where ⊕ denotes pixel-wise addition, e_i is the output of the ith branch, and w_i is the weight of that output, obtained through training of the network;
Meanwhile, in order to better extract the effective information of the events, the method provides an edge attention module EAB, and completes efficient extraction of frame edge information through an adaptive attention mechanism, the process being expressed as:

a_i = σ(ψ_{1×1}(P(e_i)))

κ_i = (e_i ⊗ a_i) ⊕ e_i

where σ is the Sigmoid activation function, ψ_{1×1} denotes a 1×1 convolution, κ_i refers to the output feature of the edge attention module in the ith branch, ⊗ refers to element-wise multiplication, and ⊕ and P refer to channel-wise addition and adaptive average pooling, respectively;
(4) cross-domain information modulation selection module CDMS
The cross-domain information modulation selection module is used for fusing information between the event domain and the frame domain; the cross-domain information modulation selection module fuses the features of the two different domains through a cross-domain attention mechanism, and specifically, two kinds of information D_1 and D_2 from different domains are fused through the following process:
D̂_1 = F^{D_1}(D_1),  D̂_2 = F^{D_2}(D_2)

W = σ(ψ_{3×3}([D̂_1, D̂_2]))

D_out = ψ_{5×5}([W ⊗ D̂_1, (1 - W) ⊗ D̂_2])

where D_1 refers to features from the frame domain, D_2 refers to features from the event domain, and the [.] operation refers to channel splicing (concatenation); F^{D_1} represents the feature extraction process of the frame domain, F^{D_2} represents the feature extraction process of the event domain; ψ_{3×3} represents a 3×3 convolution and ψ_{5×5} represents a 5×5 convolution.
CN202110015551.5A 2021-01-07 2021-01-07 Moving target visual tracking method based on multi-source information fusion Active CN112686928B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110015551.5A CN112686928B (en) 2021-01-07 2021-01-07 Moving target visual tracking method based on multi-source information fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110015551.5A CN112686928B (en) 2021-01-07 2021-01-07 Moving target visual tracking method based on multi-source information fusion

Publications (2)

Publication Number Publication Date
CN112686928A true CN112686928A (en) 2021-04-20
CN112686928B CN112686928B (en) 2022-10-14

Family

ID=75456139

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110015551.5A Active CN112686928B (en) 2021-01-07 2021-01-07 Moving target visual tracking method based on multi-source information fusion

Country Status (1)

Country Link
CN (1) CN112686928B (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110148159A (en) * 2019-05-20 2019-08-20 厦门大学 A kind of asynchronous method for tracking target based on event camera
CN112037269A (en) * 2020-08-24 2020-12-04 大连理工大学 Visual moving target tracking method based on multi-domain collaborative feature expression

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WANG Yi et al.: "Multi-target adaptive visual tracking method based on DST-PCR5", Application Research of Computers *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113269699B (en) * 2021-04-22 2023-01-03 天津(滨海)人工智能军民融合创新中心 Optical flow estimation method and system based on fusion of asynchronous event flow and gray level image
CN113269699A (en) * 2021-04-22 2021-08-17 天津(滨海)人工智能军民融合创新中心 Optical flow estimation method and system based on fusion of asynchronous event flow and gray level image
CN113378917B (en) * 2021-06-09 2023-06-09 深圳龙岗智能视听研究院 Event camera target recognition method based on self-attention mechanism
CN113378917A (en) * 2021-06-09 2021-09-10 深圳龙岗智能视听研究院 Event camera target identification method based on self-attention mechanism
CN115631407A (en) * 2022-11-10 2023-01-20 中国石油大学(华东) Underwater transparent biological detection based on event camera and color frame image fusion
CN115631407B (en) * 2022-11-10 2023-10-20 中国石油大学(华东) Underwater transparent biological detection based on fusion of event camera and color frame image
CN116188533A (en) * 2023-04-23 2023-05-30 深圳时识科技有限公司 Feature point tracking method and device and electronic equipment
CN116188533B (en) * 2023-04-23 2023-08-08 深圳时识科技有限公司 Feature point tracking method and device and electronic equipment
CN116206196A (en) * 2023-04-27 2023-06-02 吉林大学 Ocean low-light environment multi-target detection method and detection system thereof
CN116206196B (en) * 2023-04-27 2023-08-08 吉林大学 Ocean low-light environment multi-target detection method and detection system thereof
CN116309781A (en) * 2023-05-18 2023-06-23 吉林大学 Cross-modal fusion-based underwater visual target ranging method and device
CN116309781B (en) * 2023-05-18 2023-08-22 吉林大学 Cross-modal fusion-based underwater visual target ranging method and device
CN117808847A (en) * 2024-02-29 2024-04-02 中国科学院光电技术研究所 Space non-cooperative target feature tracking method integrating bionic dynamic vision

Also Published As

Publication number Publication date
CN112686928B (en) 2022-10-14

Similar Documents

Publication Publication Date Title
CN112686928B (en) Moving target visual tracking method based on multi-source information fusion
CN111209810B (en) Boundary frame segmentation supervision deep neural network architecture for accurately detecting pedestrians in real time through visible light and infrared images
Jiao et al. New generation deep learning for video object detection: A survey
CN110443827B (en) Unmanned aerial vehicle video single-target long-term tracking method based on improved twin network
Baldwin et al. Time-ordered recent event (tore) volumes for event cameras
CN106845374B (en) Pedestrian detection method and detection device based on deep learning
CN112037269B (en) Visual moving target tracking method based on multi-domain collaborative feature expression
CN109684925B (en) Depth image-based human face living body detection method and device
CN105160310A (en) 3D (three-dimensional) convolutional neural network based human body behavior recognition method
Chen et al. End-to-end learning of object motion estimation from retinal events for event-based object tracking
CN111539273A (en) Traffic video background modeling method and system
CN109697726A (en) A kind of end-to-end target method for estimating based on event camera
CN110929593A (en) Real-time significance pedestrian detection method based on detail distinguishing and distinguishing
CN103984955B (en) Multi-camera object identification method based on salience features and migration incremental learning
WO2019136591A1 (en) Salient object detection method and system for weak supervision-based spatio-temporal cascade neural network
CN107392131A (en) A kind of action identification method based on skeleton nodal distance
CN113408584B (en) RGB-D multi-modal feature fusion 3D target detection method
CN107133610B (en) Visual detection and counting method for traffic flow under complex road conditions
CN113762009B (en) Crowd counting method based on multi-scale feature fusion and double-attention mechanism
CN113592911A (en) Apparent enhanced depth target tracking method
CN114821764A (en) Gesture image recognition method and system based on KCF tracking detection
CN115661246A (en) Attitude estimation method based on self-supervision learning
CN106570885A (en) Background modeling method based on brightness and texture fusion threshold value
Liang et al. Methods of moving target detection and behavior recognition in intelligent vision monitoring.
CN108009512A (en) A kind of recognition methods again of the personage based on convolutional neural networks feature learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant