CN113190706A - Twin network image retrieval method based on second-order attention mechanism - Google Patents

Twin network image retrieval method based on second-order attention mechanism

Info

Publication number
CN113190706A
Authority
CN
China
Prior art keywords
image
training
training image
query image
order attention
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110410902.2A
Other languages
Chinese (zh)
Inventor
廖开阳
范冰
郑元林
章明珠
黄港
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian University of Technology
Original Assignee
Xian University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian University of Technology filed Critical Xian University of Technology
Priority to CN202110410902.2A priority Critical patent/CN113190706A/en
Publication of CN113190706A publication Critical patent/CN113190706A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/047Probabilistic or stochastic networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Library & Information Science (AREA)
  • Probability & Statistics with Applications (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a twin network image retrieval method based on a second-order attention mechanism, comprising the following steps: perform background subtraction on the query image and the training images; add a second-order attention mechanism after a convolutional layer of a convolutional neural network to obtain a second-order attention convolutional neural network; input the processed query image and training images into the second-order attention convolutional neural network for feature extraction to obtain query image features and training image features; apply global average pooling and L2 normalization to the query image features and training image features to obtain a query image descriptor and training image descriptors; measure the similarity between the query image descriptor and the training image descriptors, and rank the training image descriptors by similarity to obtain a ranking result; rearrange the ranking result and retrieve the training image most similar to the query image. The method can improve retrieval precision and save retrieval time, achieving fast, efficient, and accurate retrieval.

Description

Twin network image retrieval method based on second-order attention mechanism
Technical Field
The invention belongs to the technical field of image processing methods, and relates to a twin network image retrieval method based on a second-order attention mechanism.
Background
In the internet era, heterogeneous data such as images, videos, audio, and text grow at an alarming rate every day, particularly with the popularity of social networking sites such as Flickr and Facebook. For example, Facebook has registered more than 1 billion users, who upload more than 1 billion pictures per month; users of the Flickr photo-sharing site uploaded 728 million pictures in 2015, about 2 million pictures per day on average; and 28.6 billion pictures are stored in the back-end system of Taobao, the largest e-commerce system in China. For these massive picture collections rich in visual information, how to conveniently, quickly, and accurately query and retrieve the images that users need or are interested in has become a research hotspot in the field of multimedia information retrieval. Content-based image retrieval exploits the fact that computers excel at repetitive tasks, freeing people from manual annotation that consumes enormous manpower, material, and financial resources. After more than a decade of development, content-based image retrieval technology has been widely applied in search engines, e-commerce, medicine, the textile industry, the leather industry, and other aspects of daily life.
Image retrieval enables efficient querying and management of image libraries; it refers to retrieving images relevant to a text query or a visual query from a large-scale image database. Currently, image retrieval mainly comprises text-based image retrieval (TBIR), content-based image retrieval (CBIR), and semantic-based image retrieval (SBIR). Text-based image retrieval describes image characteristics with text and then retrieves images through text matching. Text-based retrieval techniques are by now well developed and mature, including probabilistic methods, PageRank methods, summarization methods, location methods, classification or part-of-speech tagging methods, and clustering methods (Cheng A, Friedman E. Manipulability of PageRank under Sybil Strategies [J]. NetEcon, 2006). Content-based image retrieval queries and analyzes the content of an image, such as low-level features like shape and texture. Image features are extracted by mathematically describing the visual content of the image, and the mathematical description of these low-level features reflects the visual content of the image itself. Semantic-based image retrieval differs from CBIR in that SBIR is an important method and idea for bridging the semantic gap: it considers not only low-level visual features but also high-level features of images, such as scene, emotion, and spatial-relationship information. In 2012, Krizhevsky et al. (Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks [C]// Advances in Neural Information Processing Systems, 2012: 1097-1105) demonstrated the power of deep convolutional neural networks for image classification, spurring their adoption in image retrieval.
Deep learning algorithms, particularly convolutional neural networks, currently achieve the best retrieval performance: they obtain the visual features of an image through combinations of pooling and convolutional layers, and combine them with feedback and classification techniques to achieve better retrieval results.
The problems currently faced are that the precision of image retrieval needs further improvement, and retrieval methods need to become more intelligent and diverse. How to retrieve the images users need quickly, efficiently, and accurately is an important topic in the field of image retrieval.
Disclosure of Invention
The invention aims to provide a twin network image retrieval method based on a second-order attention mechanism, which solves the problem of low image retrieval precision in the prior art.
The invention adopts the technical scheme that a twin network image retrieval method based on a second-order attention mechanism comprises the following steps:
step 1, performing background subtraction processing on a query image and a training image;
step 2, adding a second-order attention mechanism after the convolution layer of the convolution neural network to obtain a second-order attention convolution neural network, wherein the second-order attention mechanism is used for processing the output of the convolution layer to obtain the input of the next layer;
step 3, inputting the query image and the training image processed in the step 1 into a second-order attention convolution neural network respectively for feature extraction to obtain a query image feature and a training image feature;
step 4, performing global average pooling and L2 normalization on the query image features and the training image features to obtain a query image descriptor D2 and training image descriptors Ds1, where Ds1 denotes the descriptor of the s-th training image, s = 1 … n;
step 5, measuring the similarity between the query image descriptor and the training image descriptors, and ranking the training image descriptors by similarity to obtain a ranking result;
and step 6, rearranging the ranking result, and retrieving the training image most similar to the query image.
The invention is also characterized in that:
the convolutional neural network in step 2 comprises 2 × 3 pooling layers, 2 × 2 fully connected layers, and 3 × 1 convolutional layers, and the size of the filter in the convolutional layers is 5 × 5.
The specific process of processing the output of the convolutional layer in the step 2 to obtain the input of the next layer is as follows:
step a, denote the C-dimensional feature map of size H × W as F = [f_1, …, f_C], of size H × W × C; reshape the feature maps f_c into a feature matrix X with C rows and S = W × H columns, then compute the covariance matrix:

Σ = X·Ī·X^T    (1)

in the above formula, Ī = (1/S)(I − (1/S)·1), where I is the S × S identity matrix and 1 is the S × S matrix of all ones;
step b, perform covariance normalization on the covariance matrix Σ to obtain:

Σ = U·Λ·U^T    (2)

in the above formula, U is an orthogonal matrix and Λ = diag(λ_1, …, λ_C) is the diagonal matrix of eigenvalues;
step c, normalize the covariance matrix Σ processed in step b by converting it into a power of its eigenvalues:

Σ̂ = Σ^α = U·Λ^α·U^T    (3)

in the above formula, α is a positive real number and Λ^α = diag(λ_1^α, …, λ_C^α);
step d, let Σ̂ = Σ^α = [y_1, …, y_C]; shrink Σ̂ to obtain the channel statistic z, z ∈ R^(C×1); the c-th dimension of the channel statistic z is then calculated as:

z_c = H_GCP(y_c) = (1/C)·∑_{i=1}^{C} y_c(i)    (4)

in the above formula, H_GCP(·) denotes the global covariance pooling function;
step e, apply a gating mechanism to the statistic z_c of channel c to obtain the scaling factor w_c:

w_c = f(W_U·δ(W_D·z_c))    (5)

in the above formula, W_D and W_U are the weights of convolutional layers, and f(·) and δ(·) denote the sigmoid and ReLU functions;

the scaling factor w_c of channel c is used to adjust the feature map f_c, yielding the feature map f̂_c, which is the input of the next layer:

f̂_c = w_c · f_c    (6)
the specific process of the step 5 is as follows: computing a query image descriptor D2 and each training image descriptor Ds1 Euclidean distance ds(xsY), wherein D2 ═ (y1 … yn), Ds1=(xs1…xsn):
Figure BDA0003024077340000051
According to Euclidean distance ds(xsAnd y) sequencing the training images from low to high to obtain a sequencing result.
The specific process of step 6 is as follows: select several top-ranked training images from the ranking result, compute the average of their feature vectors, rearrange the results according to this average vector, and retrieve the training image most similar to the query image.
The invention has the beneficial effects that:
the invention relates to a twin network image retrieval method based on a second-order attention mechanism, which is characterized in that the second-order attention mechanism is added in the convolution process to strengthen second-order spatial information, and feature mapping is reweighed, so that the prominent image position is emphasized and then used for description, and the local and global performances of an image descriptor can be improved; the method can improve the retrieval precision, save the retrieval time and realize the aims of rapidness, high efficiency and accuracy.
Drawings
FIG. 1 is a flow chart of a twin network image retrieval method based on a second order attention mechanism according to the present invention;
FIG. 2 is a detailed flowchart of a twin network image retrieval method based on a second-order attention mechanism according to the present invention.
Detailed Description
The present invention will be described in detail below with reference to the accompanying drawings and specific embodiments.
A twin network image retrieval method based on a second-order attention mechanism is disclosed, as shown in FIG. 1 and FIG. 2, and specifically comprises the following steps:
step 1, selecting a data set, and performing background subtraction processing on a query image and a training image in the data set by adopting a background subtraction algorithm;
the data set selected in this example is CIFAR-10. The training pictures comprise ten categories, wherein 50000 training pictures and 10000 training pictures are total. The CIFAR-10 contains real objects in the real world, not only is the noise large, but also the proportion and the characteristics of the objects are different, and great difficulty is brought to identification. The background subtraction algorithm mainly uses a background subtraction subtractor MOG2 algorithm in python opencv, the method defaults to use the previous 120 frames of images for modeling, and a probability foreground segmentation algorithm is used, namely a Bayesian inference method is used for identifying whether an object is a foreground or not; the algorithm compares that a new observed object in the image has higher weight than an old observed object by a self-adaptive method, so that the algorithm can adapt to the change of illumination; some of the morphological operations such as the close and open operations are used to remove unwanted noise.
Step 2, add a second-order attention mechanism after a convolutional layer of the convolutional neural network, which improves the dependency between the features of each convolutional layer, to obtain the second-order attention convolutional neural network. A second-order attention module is added after each convolutional layer in turn for experiments, and the convolutional layer most suitable for the second-order attention mechanism is found from the comparison results;
specifically, the convolutional neural network includes 2 × 3 pooling layers, 2 × 2 fully-connected layers, and 3 × 1 convolutional layers containing 32, 32, and 64 filters respectively, and the filter size in the convolutional layers is 5 × 5. After the query image and the training images are convolved, the corresponding feature maps are obtained, and the weights in the network are continuously updated with a loss function to achieve the best training effect;
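As an illustration of the 5 × 5 convolutions described above, a minimal "valid" 2-D convolution with a single filter can be written as follows; the actual network stacks several such layers with learned filters and pooling, so this sketch only shows the sliding-window operation itself:

```python
import numpy as np

def conv2d_valid(image, kernel):
    """'Valid' 2-D convolution of a single-channel image with one filter,
    e.g. a 5x5 kernel as in the network above. Output spatial size is
    (H - kh + 1, W - kw + 1)."""
    kh, kw = kernel.shape
    H, W = image.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # elementwise product of the window with the kernel, then sum
            out[i, j] = (image[i:i + kh, j:j + kw] * kernel).sum()
    return out
```

For an 8 × 8 input and a 5 × 5 filter this produces a 4 × 4 feature map, matching the shrinking spatial size one expects from stacked valid convolutions.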
the specific process of processing the output of the convolutional layer to obtain the input of the next layer is as follows:
step a, denote the C-dimensional feature map of size H × W as F = [f_1, …, f_C], of size H × W × C; reshape the feature maps f_c into a feature matrix X with C rows and S = W × H columns, then compute the covariance matrix:

Σ = X·Ī·X^T    (1)

in the above formula, Ī = (1/S)(I − (1/S)·1), where I is the S × S identity matrix and 1 is the S × S matrix of all ones;
step b, since the covariance matrix Σ is symmetric and positive semi-definite, it has an eigenvalue decomposition (EIG); performing covariance normalization on Σ gives:

Σ = U·Λ·U^T    (2)

in the above formula, U is an orthogonal matrix and Λ = diag(λ_1, …, λ_C) is the diagonal matrix of eigenvalues;
step c, normalize the covariance matrix Σ processed in step b by converting it into a power of its eigenvalues:

Σ̂ = Σ^α = U·Λ^α·U^T    (3)

in the above formula, α is a positive real number and Λ^α = diag(λ_1^α, …, λ_C^α). When α = 1, no normalization is performed; when α < 1, eigenvalues larger than 1.0 are nonlinearly shrunk and eigenvalues smaller than 1.0 are stretched. Experiments show that α = 1/2 works best, so this embodiment sets α = 1/2;
and d, taking the normalized covariance matrix as a channel descriptor through global covariance pooling. In particular, make
Figure BDA0003024077340000072
By shrinking
Figure BDA0003024077340000075
Obtaining a channel statistic value z, wherein z belongs to RC×1Then the statistical value z of the channel ccThe calculation method is as follows:
Figure BDA0003024077340000073
in the above formula, HGCP(. -) represents a global covariance pool function;
step e, apply a gating mechanism to the statistic z_c of channel c to obtain the scaling factor w_c:

w_c = f(W_U·δ(W_D·z_c))    (5)

in the above formula, W_D and W_U are the weights of convolutional layers that set the channel dimension of the features to C/r and C respectively; f(·) and δ(·) denote the sigmoid and ReLU functions;

the scaling factor w_c of channel c is used to adjust the feature map f_c, yielding the feature map f̂_c, i.e., the input of the next layer:

f̂_c = w_c · f_c    (6)
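Steps a through e can be sketched end-to-end in NumPy as follows. This is an illustrative implementation under stated assumptions: W_D and W_U are randomly initialized stand-ins for the learned convolutional weights, and the reduction ratio r is a hypothetical choice, since the patent does not state its value:

```python
import numpy as np

def second_order_attention(F, alpha=0.5, r=4, rng=None):
    """Sketch of the second-order attention block (steps a-e).
    F: feature maps of shape (H, W, C); returns the reweighted maps."""
    H, W, C = F.shape
    S = H * W
    X = F.reshape(S, C).T                        # (C, S) feature matrix
    I_bar = (np.eye(S) - np.ones((S, S)) / S) / S
    cov = X @ I_bar @ X.T                        # covariance matrix, eq. (1)
    lam, U = np.linalg.eigh(cov)                 # EIG, eq. (2)
    lam = np.clip(lam, 0.0, None)                # numerical floor: cov is PSD
    cov_a = U @ np.diag(lam ** alpha) @ U.T      # eigenvalue power, eq. (3)
    z = cov_a.mean(axis=1)                       # global covariance pooling, eq. (4)
    rng = np.random.default_rng(0) if rng is None else rng
    W_D = rng.standard_normal((max(C // r, 1), C)) * 0.1   # reduce to C/r channels
    W_U = rng.standard_normal((C, max(C // r, 1))) * 0.1   # restore to C channels
    w = 1.0 / (1.0 + np.exp(-(W_U @ np.maximum(W_D @ z, 0.0))))  # gating, eq. (5)
    return F * w[None, None, :]                  # f_hat_c = w_c * f_c, eq. (6)
```

Because the sigmoid keeps every w_c in (0, 1), the block only rescales channels; it never changes the spatial layout of the feature maps.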
step 3, inputting the query image and the training image processed in the step 1 into a second-order attention convolution neural network respectively for feature extraction to obtain a query image feature and a training image feature;
step 4, perform global average pooling and L2 normalization on the query image features and the training image features, then process them with a softmax function to obtain the query image descriptor D2 and the training image descriptors Ds1, where Ds1 denotes the descriptor of the s-th training image, s = 1 … n;
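The descriptor computation of step 4, global average pooling followed by L2 normalization, can be sketched as follows; the softmax classification branch mentioned above is omitted, so this is only the descriptor path:

```python
import numpy as np

def image_descriptor(feature_maps):
    """Global average pooling over the spatial dimensions followed by
    L2 normalization; input shape (H, W, C), output shape (C,)."""
    v = feature_maps.mean(axis=(0, 1))   # global average pooling -> one value per channel
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v   # unit-length descriptor
```

The resulting descriptor always has unit L2 norm, which is what makes the Euclidean distances of step 5 comparable across images.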
step 5, measure the similarity between the query image descriptor and each training image descriptor, and rank the training image descriptors by similarity to obtain a ranking result;
specifically, the similarity measure is computed as the Euclidean distance d_s(x_s, y) between the query image descriptor D2 and each training image descriptor Ds1, where D2 = (y_1 … y_n) and Ds1 = (x_s1 … x_sn):

d_s(x_s, y) = sqrt( ∑_{i=1}^{n} (x_si − y_i)² )    (7)

The training images are sorted by the Euclidean distance d_s(x_s, y) from low to high (similarity from high to low): the smaller the distance, the greater the similarity, i.e., training images ranked nearer the top are more similar to the query image. This gives the ranking result.
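The similarity measurement and ranking of step 5 reduce to equation (7) plus an ascending sort, for example:

```python
import numpy as np

def rank_by_distance(query, train_descriptors):
    """Euclidean distance of eq. (7) between the query descriptor and every
    training descriptor, then ranking from smallest distance (most similar).
    query: (n,); train_descriptors: (m, n). Returns (order, distances)."""
    d = np.sqrt(((train_descriptors - query[None, :]) ** 2).sum(axis=1))
    order = np.argsort(d)   # ascending distance = descending similarity
    return order, d
```

With a query at the origin and training descriptors at distances 5, 1, and 0, the ranking places the identical descriptor first and the farthest one last.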
Step 6, rearrange the ranking result and retrieve the training image most similar to the query image.
Specifically, several top-ranked training images are selected from the ranking result, the average of their feature vectors is computed, the results are rearranged according to this average vector, and the training image most similar to the query image is retrieved.
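The rearrangement of step 6 resembles average query expansion and can be sketched as follows; top_k is a hypothetical parameter, since the patent only says "several top-ranked images":

```python
import numpy as np

def rerank_by_mean(train_descriptors, order, top_k=3):
    """Re-ranking sketch for step 6: average the descriptors of the top_k
    initially ranked images, then re-sort all images by distance to that
    mean vector. train_descriptors: (m, n); order: initial ranking indices."""
    mean_vec = train_descriptors[order[:top_k]].mean(axis=0)
    d = np.linalg.norm(train_descriptors - mean_vec[None, :], axis=1)
    return np.argsort(d)   # new ranking, closest to the mean first
```

Averaging over the initial top results makes the final ranking more robust to a single noisy descriptor, which is the usual motivation for this kind of rearrangement.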
In this way, the twin network image retrieval method based on a second-order attention mechanism adds a second-order attention mechanism in the convolution process to strengthen second-order spatial information and reweight the feature maps, so that salient image locations are emphasized before description, improving both the local and global performance of the image descriptor. The method improves retrieval precision and saves retrieval time, achieving fast, efficient, and accurate retrieval.

Claims (5)

1. A twin network image retrieval method based on a second-order attention mechanism is characterized by comprising the following steps:
step 1, performing background subtraction processing on a query image and a training image;
step 2, adding a second-order attention mechanism after the convolution layer of the convolution neural network to obtain a second-order attention convolution neural network, wherein the second-order attention mechanism is used for processing the output of the convolution layer to obtain the input of the next layer;
step 3, inputting the query image and the training image processed in the step 1 into a second-order attention convolution neural network respectively for feature extraction to obtain a query image feature and a training image feature;
step 4, performing global average pooling and L2 normalization on the query image features and the training image features to obtain a query image descriptor D2 and training image descriptors Ds1, where Ds1 denotes the descriptor of the s-th training image, s = 1 … n;
step 5, measuring the similarity between the query image descriptor and the training image descriptors, and ranking the training image descriptors by similarity to obtain a ranking result;
and step 6, rearranging the ranking result, and retrieving the training image most similar to the query image.
2. The method according to claim 1, wherein the convolutional neural network in step 2 comprises 2 x 3 pooling layers, 2 x 2 fully connected layers, and 3 x 1 convolutional layers, and the size of the filter in the convolutional layers is 5 x 5.
3. The twin network image retrieval method based on the second-order attention mechanism as claimed in claim 1, wherein the specific process of processing the output of the convolutional layer in step 2 to obtain the input of the next layer is as follows:
step a, denote the C-dimensional feature map of size H × W as F = [f_1, …, f_C], of size H × W × C; reshape the feature maps f_c into a feature matrix X with C rows and S = W × H columns, then compute the covariance matrix:

Σ = X·Ī·X^T    (1)

in the above formula, Ī = (1/S)(I − (1/S)·1), where I is the S × S identity matrix and 1 is the S × S matrix of all ones;
step b, perform covariance normalization on the covariance matrix Σ to obtain:

Σ = U·Λ·U^T    (2)

in the above formula, U is an orthogonal matrix and Λ = diag(λ_1, …, λ_C) is the diagonal matrix of eigenvalues;
step c, normalize the covariance matrix Σ processed in step b by converting it into a power of its eigenvalues:

Σ̂ = Σ^α = U·Λ^α·U^T    (3)

in the above formula, α is a positive real number and Λ^α = diag(λ_1^α, …, λ_C^α);
step d, let Σ̂ = Σ^α = [y_1, …, y_C]; shrink Σ̂ to obtain the channel statistic z, z ∈ R^(C×1); the c-th dimension of the channel statistic z is then calculated as:

z_c = H_GCP(y_c) = (1/C)·∑_{i=1}^{C} y_c(i)    (4)

in the above formula, H_GCP(·) denotes the global covariance pooling function;
step e, apply a gating mechanism to the statistic z_c of channel c to obtain the scaling factor w_c:

w_c = f(W_U·δ(W_D·z_c))    (5)

in the above formula, W_D and W_U are the weights of convolutional layers, and f(·) and δ(·) denote the sigmoid and ReLU functions;

the scaling factor w_c of channel c is used to adjust the feature map f_c, yielding the feature map f̂_c, which is the input of the next layer:

f̂_c = w_c · f_c    (6)
4. The twin network image retrieval method based on the second-order attention mechanism as claimed in claim 1, wherein the specific process of step 5 is as follows: compute the Euclidean distance d_s(x_s, y) between the query image descriptor D2 and each training image descriptor Ds1, where D2 = (y_1 … y_n) and Ds1 = (x_s1 … x_sn):

d_s(x_s, y) = sqrt( ∑_{i=1}^{n} (x_si − y_i)² )    (7)

sort the training images by the Euclidean distance d_s(x_s, y) from low to high to obtain the ranking result.
5. The twin network image retrieval method based on the second-order attention mechanism as claimed in claim 1, wherein the specific process of step 6 is as follows: select several top-ranked training images from the ranking result, compute the average of their feature vectors, rearrange the results according to this average vector, and retrieve the training image most similar to the query image.
CN202110410902.2A 2021-04-16 2021-04-16 Twin network image retrieval method based on second-order attention mechanism Pending CN113190706A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110410902.2A CN113190706A (en) 2021-04-16 2021-04-16 Twin network image retrieval method based on second-order attention mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110410902.2A CN113190706A (en) 2021-04-16 2021-04-16 Twin network image retrieval method based on second-order attention mechanism

Publications (1)

Publication Number Publication Date
CN113190706A true CN113190706A (en) 2021-07-30

Family

ID=76977188

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110410902.2A Pending CN113190706A (en) 2021-04-16 2021-04-16 Twin network image retrieval method based on second-order attention mechanism

Country Status (1)

Country Link
CN (1) CN113190706A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113920587A (en) * 2021-11-01 2022-01-11 哈尔滨理工大学 Human body posture estimation method based on convolutional neural network

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120070075A1 (en) * 2010-09-17 2012-03-22 Honeywell International Inc. Image processing based on visual attention and reduced search based generated regions of interest
CN111198964A (en) * 2020-01-10 2020-05-26 中国科学院自动化研究所 Image retrieval method and system
CN111354017A (en) * 2020-03-04 2020-06-30 江南大学 Target tracking method based on twin neural network and parallel attention module


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
TAO DAI et al.: "Second-order Attention Network for Single Image Super-Resolution", 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) *
武玉伟 (Wu Yuwei): "深度学习基础与应用" (Fundamentals and Applications of Deep Learning), Beijing Institute of Technology Press, 30 April 2020 *


Similar Documents

Publication Publication Date Title
Wang et al. Annotating images by mining image search results
Wang et al. Annosearch: Image auto-annotation by search
CN106033426B (en) Image retrieval method based on latent semantic minimum hash
Wang et al. Retrieval-based face annotation by weak label regularized local coordinate coding
Sun et al. Scene image classification method based on Alex-Net model
Mishra et al. Image mining in the context of content based image retrieval: a perspective
CN111723692B (en) Near-repetitive video detection method based on label features of convolutional neural network semantic classification
CN114461839A (en) Multi-mode pre-training-based similar picture retrieval method and device and electronic equipment
CN110765285A (en) Multimedia information content control method and system based on visual characteristics
Nezamabadi-pour et al. Concept learning by fuzzy k-NN classification and relevance feedback for efficient image retrieval
Yao Key frame extraction method of music and dance video based on multicore learning feature fusion
Kumar et al. Content based video retrieval using deep learning feature extraction by modified VGG_16
CN113190706A (en) Twin network image retrieval method based on second-order attention mechanism
Liu et al. Bit reduction for locality-sensitive hashing
Min et al. Overview of content-based image retrieval with high-level semantics
Zhang et al. Improved image retrieval algorithm of GoogLeNet neural network
Barz et al. Content-based image retrieval and the semantic gap in the deep learning era
Morsillo et al. Mining the web for visual concepts
Wu et al. Deep Hybrid Neural Network With Attention Mechanism for Video Hash Retrieval Method
Ouni A machine learning approach for image retrieval tasks
Zare Chahooki et al. Bridging the semantic gap for automatic image annotation by learning the manifold space.
Ding et al. A bag-of-feature model for video semantic annotation
Huo et al. Fused feature encoding in convolutional neural network
Maihami et al. Color Features and Color Spaces Applications to the Automatic Image Annotation
He et al. Construction of user preference profile in a personalized image retrieval

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210730
