CN111723600A - Pedestrian re-recognition feature descriptor based on multi-task learning - Google Patents

Pedestrian re-recognition feature descriptor based on multi-task learning

Info

Publication number
CN111723600A
Authority
CN
China
Prior art keywords
features
network
layer
lomo
deep
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910205685.6A
Other languages
Chinese (zh)
Other versions
CN111723600B (en)
Inventor
何小海 (He Xiaohai)
刘康凝 (Liu Kangning)
熊淑华 (Xiong Shuhua)
Other inventors requested that their names not be disclosed
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan University
Original Assignee
Sichuan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan University filed Critical Sichuan University
Priority to CN201910205685.6A priority Critical patent/CN111723600B/en
Publication of CN111723600A publication Critical patent/CN111723600A/en
Application granted granted Critical
Publication of CN111723600B publication Critical patent/CN111723600B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/52Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V20/53Recognition of crowd images, e.g. recognition of crowd congestion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a pedestrian re-identification feature descriptor based on multi-task learning. It adopts a twin network structure with paired inputs, sending Local Maximal Occurrence (LOMO) features and deep features together into the network and mapping them into a single feature space for training, forming a new network model, TDFN (Traditional and Deep Feature Fusion Network). Exploiting the self-learning property of neural networks, the loss functions of multiple tasks are combined to update the network, so that the deep features learn more detail information complementary to the hand-crafted local features, yielding new discriminative features. Experiments show that the mean average precision (mAP) and Rank-1 accuracy of the new features are superior to those of the global descriptor extracted directly from the twin network. The method is suitable for application systems in security and surveillance, such as video surveillance analysis and content-based image and video retrieval.

Description

Pedestrian re-recognition feature descriptor based on multi-task learning
Technical Field
The invention relates to the pedestrian re-identification problem in the field of intelligent video surveillance, and in particular to a pedestrian re-identification feature descriptor based on multi-task learning and a new network model, TDFN (Traditional and Deep Feature Fusion Network).
Background
Pedestrian re-identification (Re-ID) aims to match image frames containing the same pedestrian across cameras in surveillance video, and is a challenging topic in the field of intelligent surveillance analysis. It has attracted wide attention in industry and academia because of its important applications in security and surveillance, such as video surveillance analysis and content-based image and video retrieval. A re-identification model generally includes two parts, representation learning and metric learning. In a typical re-identification pipeline, each pedestrian picture is described by a single feature, and these features are then matched in the metric space of a specific task, in which feature vectors of the same pedestrian lie at a smaller distance than feature vectors of different pedestrians.
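As a minimal illustration of this matching stage (not part of the patent), the following Python sketch ranks a gallery of descriptors by Euclidean distance to a query descriptor; random vectors stand in for real features:

```python
import numpy as np

rng = np.random.default_rng(0)
query = rng.random(1024)            # descriptor of the probe picture
gallery = rng.random((100, 1024))   # descriptors of 100 gallery pictures

# Rank gallery entries by Euclidean distance to the query: with a good
# descriptor, images of the same pedestrian should appear first.
dists = np.linalg.norm(gallery - query, axis=1)
ranking = np.argsort(dists)
print(ranking[:5])                  # indices of the 5 closest gallery images
```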
In real scenes, because of significant changes in viewpoint, illumination, background clutter and occlusion across different cameras, the same pedestrian often looks very different in non-overlapping camera views. Hand-crafted combinations of different visual characteristics can overcome such cross-view changes in the re-identification task and are sometimes more reliable. Among hand-crafted features, color and texture are the two most useful: for example, HSV and LAB color histograms measure the color information of an image, while LBP histograms and Gabor filters describe its texture. Although hand-crafted features have a certain distinctiveness, their performance is inferior to pedestrian features extracted by deep learning. In recent years, many algorithms learn features directly from the raw input picture through neural networks, and different networks have been studied for pedestrian re-identification: a twin (Siamese) network structure that jointly learns an identification loss and a verification loss, a triplet network that learns relative similarities among three images (an anchor image, a positive image and a negative image), and a quadruplet deep network that learns a margin-based hard-example mining strategy from four input images. These methods can effectively learn global pedestrian representations, but they ignore the very rich information around local body positions and may produce suboptimal results in some scenarios. The LOMO feature is a hand-crafted local feature composed of the color and texture histogram information of local blocks and contains rich detail information. On this basis, the LOMO features are complementary to the deep features learned by the twin network.
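For context, a heavily simplified sketch of such a hand-crafted local descriptor is given below: joint HSV color histograms over overlapping patches, max-pooled across all patches at the same height. The patch size, stride, and bin counts are illustrative assumptions; the actual LOMO descriptor additionally uses SILTP texture histograms and multiple scales.

```python
import cv2
import numpy as np

def simplified_lomo(image_bgr, patch=10, stride=5, bins=8):
    """Simplified LOMO-style descriptor: joint HSV histograms over local
    patches, max-pooled over all patches in the same horizontal stripe.
    Illustrative only; real LOMO also uses SILTP texture and several scales."""
    hsv = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2HSV)
    h, w, _ = hsv.shape
    stripes = []
    for y in range(0, h - patch + 1, stride):
        hists = []
        for x in range(0, w - patch + 1, stride):
            p = hsv[y:y + patch, x:x + patch].reshape(-1, 3)
            # OpenCV hue spans 0-179, so the upper hue bins stay empty;
            # acceptable for a sketch.
            hist, _ = np.histogramdd(p, bins=(bins, bins, bins),
                                     range=((0, 256), (0, 256), (0, 256)))
            hists.append(hist.ravel())
        # Max pooling across patches at the same height keeps the strongest
        # local response, giving some robustness to viewpoint changes.
        stripes.append(np.max(hists, axis=0))
    return np.concatenate(stripes)
```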
Disclosure of Invention
The invention provides a pedestrian re-identification feature descriptor based on multi-task learning. It adopts a twin network structure with paired inputs, sending Local Maximal Occurrence (LOMO) features and deep features together into the network and mapping them into a single feature space for training, forming a new network model, TDFN (Traditional and Deep Feature Fusion Network). Exploiting the self-learning property of neural networks, the loss functions of multiple tasks are combined to update the network, so that the deep features learn more detail information complementary to the hand-crafted local features, yielding new discriminative features.
The invention achieves this purpose through the following technical scheme:
(1) The deep features and Local Maximal Occurrence (LOMO) features of the paired pictures are extracted, and the dimensionality of the LOMO features is reduced using a fully connected layer.
(2) The deep features and the dimension-reduced LOMO features are fed into a network and mapped into a single feature space for training.
(3) The network performs multi-task learning: it not only analyses the pedestrian similarity of the two pictures but also predicts the pedestrian identity in each picture.
(4) The loss functions of the multiple tasks are combined and, exploiting the self-learning of the neural network, the detail information in the LOMO features influences the learning of the deep features.
Drawings
FIG. 1 is a framework diagram of the pedestrian re-identification feature descriptor based on multi-task learning;
Detailed Description
The invention will be further described with reference to the accompanying drawings in which:
the TDFN model network structure is specifically as follows:
the model adopts a twin network structure, comprising two CNN models (resulting from the removal of the last layer of FC by the ResNet-50 network), and the two CNN models share weights. Two pictures are input, and two deep features are output by the two CNN models. In addition, LOMO characteristics of the two pictures are extracted and sent to a full connection layer to reduce dimensionality, so that the huge difference between the dimensionality of the two characteristics can be alleviated for fusion. And then, the deep features extracted by the twin network and the LOMO features subjected to dimensionality reduction are sent to a Merge1 layer and a Merge2 layer together for fusion of the two features, and then the deep features and the LOMO features are sent to an FC3 layer and an FC4 layer together for learning, so that two new features are obtained. The network has three tasks (two tasks for predicting the identity of the pedestrian and one task for acquiring the similarity of the pedestrian of two images), the loss functions generated by each task are weighted together to update the network, the self-learning characteristic of the neural network is utilized, the parameters of the image convolution kernel are optimized, the deep features are promoted to learn more detailed information which is complementary with the LOMO features, and therefore the new features with discriminative power are obtained.
The new fused features of the TDFN model are specifically as follows:
To obtain a better feature representation, a large number of images is required for model training. However, re-identification datasets do not contain that many images, so the invention extracts deep features with a twin network whose parameters are pre-trained on ImageNet. Although the features generated by the twin network learn an effective global pedestrian representation, they ignore the very rich information around local body positions, whereas the LOMO feature is a hand-crafted local feature composed of the color and texture histogram information of local blocks and contains rich detail information; the two kinds of features are therefore complementary. Accordingly, the LOMO features and deep features of the paired pictures are extracted and sent into the network for multi-task training; using the principle of back propagation, the network is updated with a weighted combination of the different task losses, so that the extraction of the deep features is regularized by the detail information in the local hand-crafted features and new discriminative features are obtained. Two pictures $p_i$ and $p_j$ are input, the deep features and LOMO features of the two pictures are acquired and sent together into the Merge1 layer and the Merge2 layer, and two new features $f_{new}^{1}$ and $f_{new}^{2}$ are formed through the fully connected layers FC3 and FC4. The inputs to the FC3 and FC4 layers are:

$x_1 = [LOMO_1, Deep\_Feature_1]$  (1)
$x_2 = [LOMO_2, Deep\_Feature_2]$  (2)

The outputs of the FC3 and FC4 layers are:

$f_{new}^{1} = h(W_1 x_1 + b_1)$  (3)
$f_{new}^{2} = h(W_2 x_2 + b_2)$  (4)

where $h(\cdot)$ is the activation function; the FC3 and FC4 layers use ReLU together with a dropout layer (dropout rate 0.5) to avoid learning redundant representations and prevent overfitting. According to the principle of back propagation, the weights and biases of the $z$-th layer after one iteration are:

$W_z \leftarrow W_z - \alpha \, \partial LOSS_{Multi} / \partial W_z$  (5)
$b_z \leftarrow b_z - \alpha \, \partial LOSS_{Multi} / \partial b_z$  (6)
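Equations (1)-(6) amount to a single fully connected layer followed by a plain SGD step; the toy numpy sketch below (all dimensions illustrative, with a placeholder gradient) makes the shapes concrete:

```python
import numpy as np

rng = np.random.default_rng(0)
lomo = rng.random(512)                # dimension-reduced LOMO feature
deep = rng.random(2048)               # deep feature from the backbone
x1 = np.concatenate([lomo, deep])     # eq. (1): x1 = [LOMO1, Deep_Feature1]

W1 = rng.standard_normal((1024, x1.size)) * 0.01
b1 = np.zeros(1024)
relu = lambda v: np.maximum(v, 0.0)   # h(.) with ReLU activation
f_new1 = relu(W1 @ x1 + b1)           # eq. (3): f_new1 = h(W1 x1 + b1)

# Eqs. (5)/(6): vanilla SGD once the loss gradient is known; a random
# placeholder stands in for dLOSS_Multi/dW1 here.
alpha = 0.01
dW1 = rng.standard_normal(W1.shape)
W1 -= alpha * dW1
```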
the new characteristics of the joint multitask lost learning are as follows:
in the TDFN network, such a joint loss based on multi-task learning can better extract features by not only effectively extracting features for each image, but also comparing pairs of pictures through a deep network. The two new features of the full-connection layer are learned by joint loss of multi-task learning, and extraction of deep features is influenced by local blocks in manual features through back propagation, so that the two features are subjected to complementary learning. The model comprises three tasks, wherein the three tasks comprise a task for acquiring the similarity of the pedestrians and two tasks for predicting the identity of the pedestrians, and the specific process comprises the following steps:
Acquiring pedestrian similarity: the two pedestrian descriptors $f_{new}^{1}$ and $f_{new}^{2}$ of the FC3 and FC4 layers enter a Square layer, which computes their squared difference element by element:

$f_s = (f_{new}^{1} - f_{new}^{2})^2$  (7)

A convolutional layer then converts $f_s$ into a two-dimensional vector $\hat{q}$ representing the similarity of the two pictures:

$\hat{q} = \mathrm{sigmoid}(\theta_s \circ f_s)$  (8)

where $\theta_s$ denotes the parameters of the convolutional layer, $\circ$ denotes the convolution operation, and sigmoid is the activation function. Comparing the similarity score $\hat{q}$ with the true matching degree of the two pictures yields the verification loss, calculated as:

$LOSS_v = -\sum_{n=1}^{2} q_n \log \hat{q}_n$  (9)

where, when $p_i$ and $p_j$ are the same person, $q_1 = 1$ and $q_2 = 0$; otherwise $q_1 = 0$ and $q_2 = 1$.
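A minimal PyTorch rendering of this verification branch, assuming the convolutional score layer is replaced by an equivalent linear layer over the flat feature vector and a two-class cross-entropy in place of the sigmoid formulation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def verification_loss(f1, f2, same, verif_head):
    """Square layer plus two-way similarity score plus cross-entropy,
    following eqs. (7)-(9). `same` holds 1 for a matching pair, else 0."""
    fs = (f1 - f2) ** 2                   # eq. (7): element-wise squared diff
    logits = verif_head(fs)               # 2-dim similarity vector q_hat
    return F.cross_entropy(logits, same)  # eq. (9) as two-class cross-entropy

# Usage with illustrative shapes:
head = nn.Linear(1024, 2)
f1, f2 = torch.randn(4, 1024), torch.randn(4, 1024)
loss_v = verification_loss(f1, f2, torch.tensor([1, 0, 1, 0]), head)
```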
Predicting pedestrian identity: each pedestrian descriptor $f_{new}^{i}$ is fed into a convolutional layer and mapped into a one-dimensional vector of size $K$, where $K$ equals the number of pedestrian identities in the dataset. A Softmax layer then predicts the pedestrian identity; its output is:

$\hat{p}_i = \mathrm{softmax}(\theta_{id} \circ f_{new}^{i})$  (10)

where $\theta_{id}$ denotes the parameters of the convolutional layer, $\circ$ denotes the convolution operation, and $\hat{p}_i$ predicts the identity of the corresponding input picture. Comparing $\hat{p}_i$ with the true identity label of the corresponding picture yields the identification loss:

$LOSS_{id} = -\sum_{k=1}^{K} p_k \log \hat{p}_k$  (11)

where $p_k$ represents the identity label of the input picture: when the input picture has identity $t$, $p_t = 1$ and $p_k = 0$ for all other $k$.
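The identity branch is a K-way softmax classifier applied to each fused descriptor; a sketch under the same assumptions (a linear layer in place of the convolutional layer, illustrative dimensions, K = 751 as on Market1501):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

K = 751                          # number of identities in the dataset
id_head = nn.Linear(1024, K)     # stands in for the conv layer of eq. (10)

def identification_loss(f_new, identity):
    """Eqs. (10)-(11): map the fused descriptor to K identity scores and
    apply softmax cross-entropy against the true identity label t."""
    logits = id_head(f_new)
    return F.cross_entropy(logits, identity)

loss_id = identification_loss(torch.randn(4, 1024), torch.tensor([3, 7, 0, 42]))
```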
Finally, the loss function of the network herein is defined as:

$LOSS_{Multi} = LOSS_v + LOSS_{id}$  (12)
The complementary learning of the deep features and the LOMO features is as follows:
During training, suppose the deep feature of a picture is $f$ and its LOMO feature is $\tilde{f}$, so that the input of the fully connected layer FC3 is $x = [\tilde{f}, f]$. Let $w_{ji}^{n}$ be the weight connecting node $j$ of layer $n$ with node $i$ of layer $n-1$. In forward propagation, node $j$ of layer $n$ outputs:

$a_j^n = h(z_j^n)$, where $z_j^n = \sum_i w_{ji}^{n} a_i^{n-1}$  (13)

During training, stochastic gradient descent uses the gradient $\delta_j^n$ generated by the multi-task weighted loss function $LOSS_{Multi}$ together with the learning rate $\alpha$ to update the weights $w_{ji}^{n}$ and optimize the network, where the gradient $\delta_j^n$ is calculated as:

$\delta_j^n = \partial LOSS_{Multi} / \partial z_j^n$  (14)

Suppose the output of node $q$ of the FC3 layer (the 6th layer) is $a_q^6$. Then, from equations (13), (14) and (5):

$\delta_q^6 = \partial LOSS_{Multi} / \partial a_q^6 \cdot h'(z_q^6)$  (15)
$\partial LOSS_{Multi} / \partial w_{qi}^{6} = \delta_q^6 \, a_i^5$  (16)
$w_{qi}^{6} \leftarrow w_{qi}^{6} - \alpha \, \delta_q^6 \, a_i^5$  (17)

Thus, in the TDFN network, $\tilde{f}$ influences $w_{qi}^{6}$ in two ways. First, the LOMO feature $\tilde{f}$ propagates forward through $a_i^5$ (the components of $x$), so the weights applied to the deep feature receive the influence of $\tilde{f}$ when the network is updated by back propagation. Second, the output gradient $\delta_q^6$ of the loss function $LOSS_{Multi}$ is also affected by $\tilde{f}$, which in turn affects the weights of the deep features. In this way, complementary learning of the two kinds of features is realized.
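The two influence paths can be observed directly with automatic differentiation: in the toy example below, changing only the LOMO half of the concatenated input changes the gradient that reaches the weight columns multiplying the deep half (sizes and the quadratic stand-in loss are illustrative):

```python
import torch

torch.manual_seed(0)
deep = torch.randn(8)                        # deep feature f (held fixed)
W = torch.randn(4, 12, requires_grad=True)   # fused-layer weights, eq. (13)

def grad_on_deep_weights(lomo):
    """Gradient of a stand-in loss w.r.t. the weight columns that multiply
    the deep feature, for a given LOMO input (cf. eqs. (13)-(17))."""
    x = torch.cat([lomo, deep])              # x = [f_tilde, f]
    loss = (W @ x).pow(2).sum()              # quadratic stand-in for LOSS_Multi
    grad, = torch.autograd.grad(loss, W)
    return grad[:, 4:]                       # columns acting on the deep part

g1 = grad_on_deep_weights(torch.zeros(4))
g2 = grad_on_deep_weights(torch.ones(4))
print(torch.allclose(g1, g2))                # False: the LOMO input alters
                                             # the update of the deep weights
```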
The invention uses two different metric learning methods to verify the proposed features on the Market1501 and DukeMTMC-ReID databases, and compares them with a baseline model and several mainstream algorithms. Evaluation is performed under the single-query setting using two indices: Rank-k accuracy (k = 1, 5, 10) and mean average precision (mAP). The experimental results are shown in Tables 1, 2 and 3:
Table 1: Comparison with the baseline model (table reproduced as an image in the original publication)
Table 2: Comparison with mainstream algorithms on the Market1501 dataset (table reproduced as an image in the original publication)
Table 3: Comparison with mainstream algorithms on the DukeMTMC-ReID dataset (table reproduced as an image in the original publication)
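For reference, simplified single-query versions of the two reported metrics can be computed per query as in the sketch below; the real evaluation protocol additionally discards gallery images of the same identity taken by the same camera, which is omitted here:

```python
import numpy as np

def rank_k_and_ap(sorted_gallery_labels, query_label, ks=(1, 5, 10)):
    """CMC Rank-k hits and average precision for one query, given gallery
    labels sorted by ascending descriptor distance (simplified protocol)."""
    hits = np.asarray(sorted_gallery_labels) == query_label
    rank_k = {k: bool(hits[:k].any()) for k in ks}
    precision_at = np.cumsum(hits) / (np.arange(hits.size) + 1)
    ap = (precision_at * hits).sum() / max(hits.sum(), 1)
    return rank_k, ap

# Example: the true matches sit at ranks 2 and 5 of the sorted gallery.
print(rank_k_and_ap([7, 3, 8, 1, 3, 9], query_label=3))
```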

Claims (5)

1. A pedestrian re-identification feature descriptor based on multi-task learning, characterized by comprising the following steps:
(1) extracting the Local Maximal Occurrence (LOMO) features and deep features of the paired pictures, and reducing the dimensionality of the LOMO features using a fully connected layer;
(2) sending the deep features and the dimension-reduced LOMO features into a network and mapping them into a single feature space for training, forming a new model, TDFN (Traditional and Deep Feature Fusion Network);
(3) the TDFN model uses a multi-task learning network that not only analyses the pedestrian similarity of the paired pictures but also predicts the pedestrian identity in each picture;
(4) the loss functions of the multiple tasks are combined to update the network and, exploiting the self-learning characteristic of the neural network, prompt the deep features to learn more detail information complementary to the LOMO features.
2. The method of claim 1, wherein in step (1) the deep features are extracted using a twin network, the twin network comprising two CNN models pre-trained on ImageNet, the backbone of each CNN model being a ResNet-50 network with its last FC layer removed; in addition, the LOMO features of the paired pictures are extracted and reduced in dimension using a fully connected layer, which mitigates the large difference between the dimensions of the two kinds of features for fusion.
3. The method according to claim 1, wherein in step (2) the deep features and the dimension-reduced LOMO features of the two pictures are combined in the Merge1 layer and the Merge2 layer, and two new features $f_{new}^{1}$ and $f_{new}^{2}$ are formed through the fully connected FC3 and FC4 layers; the two new features are then sent into the multi-task learning network for training. With two input pictures $p_i$ and $p_j$, the inputs to the FC3 and FC4 layers are:

$x_1 = [LOMO_1, Deep\_Feature_1]$  (1)
$x_2 = [LOMO_2, Deep\_Feature_2]$  (2)

and the outputs of the FC3 and FC4 layers are:

$f_{new}^{1} = h(W_1 x_1 + b_1)$  (3)
$f_{new}^{2} = h(W_2 x_2 + b_2)$  (4)
4. The method according to claim 1, wherein in step (3) features are not only extracted effectively from each image, but the paired images are also compared through the deep network; the model has three tasks, comprising one task of acquiring pedestrian similarity and two tasks of predicting pedestrian identity, the specific processes being as follows:
acquiring pedestrian similarity: the two pedestrian descriptors $f_{new}^{1}$ and $f_{new}^{2}$ of the FC3 and FC4 layers enter a Square layer, which computes their squared difference element by element:

$f_s = (f_{new}^{1} - f_{new}^{2})^2$  (5)

a convolutional layer then converts $f_s$ into a two-dimensional vector $\hat{q}$ representing the similarity of the two pictures; comparing the similarity score with the true matching degree of the two pictures yields the verification loss, calculated as:

$LOSS_v = -\sum_{n=1}^{2} q_n \log \hat{q}_n$  (6)

predicting pedestrian identity: each pedestrian descriptor $f_{new}^{i}$ is fed to a Softmax layer that predicts the pedestrian identity, with output:

$\hat{p}_i = \mathrm{softmax}(\theta_{id} \circ f_{new}^{i})$  (7)

comparing $\hat{p}_i$ with the true identity label of the corresponding picture yields the identification loss:

$LOSS_{id} = -\sum_{k=1}^{K} p_k \log \hat{p}_k$  (8)
5. The method of claim 1, wherein in step (4) the gradient of a node during deep feature learning is influenced by the LOMO features in two ways: first, information in the LOMO features propagates through the ReLU function of the fully connected layer, so that the extraction of the deep features adapts the convolution kernels to the LOMO features, realizing forward complementarity; second, the output gradient of the multi-task weighted loss function $LOSS_{Multi}$ is also affected by the LOMO features, where $LOSS_{Multi}$ is calculated as:

$LOSS_{Multi} = LOSS_v + LOSS_{id}$  (9)

On this basis, the deep features learn detail information complementary to the LOMO features when back propagation updates the network.
CN201910205685.6A 2019-03-18 2019-03-18 Pedestrian re-recognition feature descriptor based on multi-task learning Active CN111723600B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910205685.6A CN111723600B (en) 2019-03-18 2019-03-18 Pedestrian re-recognition feature descriptor based on multi-task learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910205685.6A CN111723600B (en) 2019-03-18 2019-03-18 Pedestrian re-recognition feature descriptor based on multi-task learning

Publications (2)

Publication Number Publication Date
CN111723600A true CN111723600A (en) 2020-09-29
CN111723600B CN111723600B (en) 2022-07-05

Family

ID=72562847

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910205685.6A Active CN111723600B (en) 2019-03-18 2019-03-18 Pedestrian re-recognition feature descriptor based on multi-task learning

Country Status (1)

Country Link
CN (1) CN111723600B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112507835A (en) * 2020-12-01 2021-03-16 燕山大学 Method and system for analyzing multi-target object behaviors based on deep learning technology
CN113449777A (en) * 2021-06-08 2021-09-28 上海深至信息科技有限公司 Automatic thyroid nodule grading method and system
CN114581425A (en) * 2022-03-10 2022-06-03 四川大学 Myocardial segment defect image processing method based on deep neural network
CN114821354A (en) * 2022-04-19 2022-07-29 福州大学 Urban building change remote sensing detection method based on twin multitask network

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106709449A (en) * 2016-12-22 2017-05-24 深圳市深网视界科技有限公司 Pedestrian re-recognition method and system based on deep learning and reinforcement learning
US20180253596A1 (en) * 2017-03-06 2018-09-06 Conduent Business Services, Llc System and method for person re-identification using overhead view images
CN108596010A (en) * 2017-12-31 2018-09-28 厦门大学 The implementation method of pedestrian's weight identifying system
CN108960141A (en) * 2018-07-04 2018-12-07 国家新闻出版广电总局广播科学研究院 Pedestrian's recognition methods again based on enhanced depth convolutional neural networks

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106709449A (en) * 2016-12-22 2017-05-24 深圳市深网视界科技有限公司 Pedestrian re-recognition method and system based on deep learning and reinforcement learning
US20180253596A1 (en) * 2017-03-06 2018-09-06 Conduent Business Services, Llc System and method for person re-identification using overhead view images
CN108596010A (en) * 2017-12-31 2018-09-28 厦门大学 The implementation method of pedestrian's weight identifying system
CN108960141A (en) * 2018-07-04 2018-12-07 国家新闻出版广电总局广播科学研究院 Pedestrian's recognition methods again based on enhanced depth convolutional neural networks

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
CHENG WANG et al.: "Mancs: A Multi-task Attentional Network with Curriculum Sampling for Person Re-identification", Proceedings of the European Conference on Computer Vision (ECCV) *
IGOR BARROS BARBOSA et al.: "Looking beyond appearances: Synthetic training data for deep CNNs in re-identification", Computer Vision and Image Understanding *
SHANGXUAN WU et al.: "An enhanced deep feature representation for person re-identification", 2016 IEEE Winter Conference on Applications of Computer Vision (WACV) *
QI Meibin et al.: "Person re-identification with multi-feature fusion and the alternating direction method of multipliers", Journal of Image and Graphics *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112507835A (en) * 2020-12-01 2021-03-16 燕山大学 Method and system for analyzing multi-target object behaviors based on deep learning technology
CN113449777A (en) * 2021-06-08 2021-09-28 上海深至信息科技有限公司 Automatic thyroid nodule grading method and system
CN114581425A (en) * 2022-03-10 2022-06-03 四川大学 Myocardial segment defect image processing method based on deep neural network
CN114581425B (en) * 2022-03-10 2022-11-01 四川大学 Myocardial segment defect image processing method based on deep neural network
CN114821354A (en) * 2022-04-19 2022-07-29 福州大学 Urban building change remote sensing detection method based on twin multitask network
CN114821354B (en) * 2022-04-19 2024-06-07 福州大学 Urban building change remote sensing detection method based on twin multitasking network

Also Published As

Publication number Publication date
CN111723600B (en) 2022-07-05

Similar Documents

Publication Publication Date Title
CN111723600B (en) Pedestrian re-recognition feature descriptor based on multi-task learning
CN107408211B (en) Method for re-identification of objects
WO2020228525A1 (en) Place recognition method and apparatus, model training method and apparatus for place recognition, and electronic device
CN106919920B (en) Scene recognition method based on convolution characteristics and space vision bag-of-words model
Kagaya et al. Highly accurate food/non-food image classification based on a deep convolutional neural network
CN110135249B (en) Human behavior identification method based on time attention mechanism and LSTM (least Square TM)
CN112184752A (en) Video target tracking method based on pyramid convolution
CN109063649B (en) Pedestrian re-identification method based on twin pedestrian alignment residual error network
CN110457515B (en) Three-dimensional model retrieval method of multi-view neural network based on global feature capture aggregation
Xia et al. Loop closure detection for visual SLAM using PCANet features
CN110222718B (en) Image processing method and device
Haque et al. Two-handed bangla sign language recognition using principal component analysis (PCA) and KNN algorithm
CN110046544A (en) Digital gesture identification method based on convolutional neural networks
CN104281572A (en) Target matching method and system based on mutual information
CN112507778B (en) Loop detection method of improved bag-of-words model based on line characteristics
US11908222B1 (en) Occluded pedestrian re-identification method based on pose estimation and background suppression
CN114419732A (en) HRNet human body posture identification method based on attention mechanism optimization
CN115280373A (en) Managing occlusions in twin network tracking using structured dropping
CN113361549A (en) Model updating method and related device
CN112906520A (en) Gesture coding-based action recognition method and device
CN109255339A (en) Classification method based on adaptive depth forest body gait energy diagram
CN111291785A (en) Target detection method, device, equipment and storage medium
Ahmad et al. Embedded deep vision in smart cameras for multi-view objects representation and retrieval
Zhang et al. Visual Object Tracking via Cascaded RPN Fusion and Coordinate Attention.
Özyurt et al. A new method for classification of images using convolutional neural network based on Dwt-Svd perceptual hash function

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant