CN113190706A - Twin network image retrieval method based on second-order attention mechanism - Google Patents
Twin network image retrieval method based on a second-order attention mechanism
- Publication number
- CN113190706A (application CN202110410902.2A)
- Authority
- CN
- China
- Prior art keywords
- image
- training
- training image
- query image
- order attention
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/583—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/047—Probabilistic or stochastic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention discloses a twin network image retrieval method based on a second-order attention mechanism, which comprises the following steps: performing background subtraction on the query image and the training images; adding a second-order attention mechanism after a convolutional layer of a convolutional neural network to obtain a second-order attention convolutional neural network; inputting the processed query image and training images into the second-order attention convolutional neural network for feature extraction to obtain query image features and training image features; applying global average pooling and L2 normalization to the query image features and training image features to obtain a query image descriptor and training image descriptors; measuring the similarity between the query image descriptor and the training image descriptors, and ranking the training image descriptors by similarity to obtain a ranking result; and re-ranking the ranking result to retrieve the training image most similar to the query image. The method improves retrieval precision and saves retrieval time, achieving fast, efficient, and accurate retrieval.
Description
Technical Field
The invention belongs to the technical field of image processing, and relates to a twin network image retrieval method based on a second-order attention mechanism.
Background
In the Internet era, heterogeneous data such as images, videos, audio, and text grow at an alarming rate every day, especially with the popularity of social networking sites such as Flickr and Facebook. For example, Facebook has more than one billion registered users, who upload more than one billion pictures per month; users of the Flickr photo-sharing site uploaded 728 million pictures in 2015, an average of roughly 2 million pictures per day; and the back-end system of Taobao, China's largest e-commerce platform, stores 28.6 billion pictures. Given these massive collections of pictures rich in visual information, how to conveniently, quickly, and accurately query and retrieve the images a user needs or finds interesting from such vast image libraries has become a research hotspot in the field of multimedia information retrieval. Content-based image retrieval exploits the fact that computers excel at repetitive processing tasks, freeing people from manual labeling that consumes enormous human, material, and financial resources. After a decade of development, content-based image retrieval technology has been widely applied in areas of everyday life such as search engines, e-commerce, medicine, and the textile and leather industries.
Image retrieval enables efficient querying and management of image libraries; it refers to retrieving images relevant to a text query or a visual query from a large-scale image database. The approaches in current use are text-based image retrieval (TBIR), content-based image retrieval (CBIR), and semantic-based image retrieval (SBIR). Text-based image retrieval mainly describes the characteristics of an image with text and then retrieves images through text matching. Text-based retrieval techniques are by now well developed and mature, e.g., probabilistic methods, PageRank methods, summarization methods, location methods, classification or part-of-speech tagging methods, and clustering methods (Cheng A, Friedman E. Manipulability of PageRank under Sybil strategies [J]. NetEcon, 2006). Content-based image retrieval queries and analyzes the content of an image, such as its low-level features of shape, texture, and so on: image features are extracted by describing the visual content of the image mathematically, and these mathematical descriptions of low-level features reflect the visual content of the image itself. Semantic-based image retrieval differs from CBIR in that SBIR is an important method and idea for bridging the semantic gap: it considers not only low-level visual features but also high-level information in the image, such as scene, emotion, and spatial relationships. In 2012, Krizhevsky et al. (Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks [C]// Advances in Neural Information Processing Systems, 2012: 1097-1105) demonstrated the effectiveness of deep convolutional neural networks for large-scale image classification. Deep learning algorithms, in particular convolutional neural networks, currently obtain the best retrieval results: they extract the visual features of an image through a combination of pooling layers and convolutional layers and, combined with feedback and classification techniques, achieve better retrieval results.
The problems currently faced are that the precision of image retrieval needs further improvement and that retrieval methods must become more intelligent and diverse. How to retrieve the images users need quickly, efficiently, and accurately is an important topic in the field of image retrieval.
Disclosure of Invention
The invention aims to provide a twin network image retrieval method based on a second-order attention mechanism, which solves the problem of low image retrieval precision in the prior art.
The technical scheme adopted by the invention is a twin network image retrieval method based on a second-order attention mechanism, which comprises the following steps:
step 1, performing background subtraction processing on a query image and a training image;
step 2, adding a second-order attention mechanism after the convolutional layer of the convolutional neural network to obtain a second-order attention convolutional neural network, wherein the second-order attention mechanism processes the output of the convolutional layer to obtain the input of the next layer;
step 3, inputting the query image and the training image processed in step 1 into the second-order attention convolutional neural network respectively for feature extraction to obtain query image features and training image features;
step 4, carrying out global average pooling and L2 normalization on the query image features and the training image features to obtain a query image descriptor D2 and training image descriptors Ds1, where Ds1 denotes the descriptor of the s-th training image, s = 1, …, n;
step 5, measuring the similarity between the query image descriptor and the training image descriptors, and ranking the training image descriptors according to similarity to obtain a ranking result;
and step 6, re-ranking the ranking result and retrieving the training image most similar to the query image.
The invention is also characterized in that:
the convolutional neural network in step 2 comprises 2 × 3 pooling layers, 2 × 2 fully connected layers, and 3 × 1 convolutional layers, and the size of the filter in the convolutional layers is 5 × 5.
The specific process by which the second-order attention mechanism in step 2 converts the output of the convolutional layer into the input of the next layer is as follows:

step a, denote the C feature maps of size H × W as F = [f_1, …, f_C], of overall size H × W × C, and reshape the feature maps f_c into a feature matrix X with C rows and S = W × H columns; the covariance matrix is then calculated by:

$$\Sigma = X \bar{I} X^{\mathsf{T}} \quad (1)$$

where $\bar{I} = \frac{1}{S}\bigl(I - \frac{1}{S}\mathbf{1}\bigr)$, I is the S × S identity matrix, and $\mathbf{1}$ is the S × S all-ones matrix;

step b, perform covariance normalization on the covariance matrix Σ by eigenvalue decomposition:

$$\Sigma = U \Lambda U^{\mathsf{T}} \quad (2)$$

where U is an orthogonal matrix and $\Lambda = \operatorname{diag}(\lambda_1, \ldots, \lambda_C)$ is the diagonal matrix of eigenvalues;

step c, normalize the covariance matrix Σ processed in step b by converting it into a power of its eigenvalues:

$$\hat{\Sigma} = \Sigma^{\alpha} = U \Lambda^{\alpha} U^{\mathsf{T}} \quad (3)$$

where α is a positive real number and $\Lambda^{\alpha} = \operatorname{diag}(\lambda_1^{\alpha}, \ldots, \lambda_C^{\alpha})$;

step d, let $\hat{\Sigma} = [y_1, \ldots, y_C]$ and shrink $\hat{\Sigma}$ to obtain the channel statistic z, with $z \in \mathbb{R}^{C \times 1}$; the c-th dimension of the channel statistic z is then calculated as:

$$z_c = H_{\mathrm{GCP}}(y_c) = \frac{1}{C} \sum_{i=1}^{C} y_c(i) \quad (4)$$

where $H_{\mathrm{GCP}}(\cdot)$ denotes the global covariance pooling function;

step e, apply a gating mechanism to the statistic $z_c$ of channel c to obtain the scaling factor $w_c$ of channel c:

$$w_c = f\bigl(W_U\, \delta(W_D\, z_c)\bigr) \quad (5)$$

where $W_D$ and $W_U$ are the weights of convolutional layers, and f(·) and δ(·) denote the sigmoid and ReLU functions;

step f, use the scaling factor $w_c$ of channel c to adjust the feature map $f_c$, obtaining the adjusted feature map $\hat{f}_c$ as the input of the next layer:

$$\hat{f}_c = w_c \cdot f_c \quad (6)$$
the specific process of the step 5 is as follows: computing a query image descriptor D2 and each training image descriptor Ds1 Euclidean distance ds(xsY), wherein D2 ═ (y1 … yn), Ds1=(xs1…xsn):
According to Euclidean distance ds(xsAnd y) sequencing the training images from low to high to obtain a sequencing result.
The specific process of step 6 is as follows: select several top-ranked training images from the ranking result, calculate the average of their feature vectors, rearrange the result according to this average vector, and retrieve the training image most similar to the query image.
The invention has the following beneficial effects:
The twin network image retrieval method based on a second-order attention mechanism adds a second-order attention mechanism to the convolution process to strengthen second-order spatial information and reweight the feature maps, so that salient image locations are emphasized before being used for description, which improves both the local and global performance of the image descriptors. The method improves retrieval precision, saves retrieval time, and achieves fast, efficient, and accurate retrieval.
Drawings
FIG. 1 is a flow chart of a twin network image retrieval method based on a second order attention mechanism according to the present invention;
FIG. 2 is a detailed flowchart of a twin network image retrieval method based on a second-order attention mechanism according to the present invention.
Detailed Description
The present invention will be described in detail below with reference to the accompanying drawings and specific embodiments.
A twin network image retrieval method based on a second-order attention mechanism, as shown in FIG. 1 and FIG. 2, specifically comprises the following steps:
Step 1, select a data set, and apply a background subtraction algorithm to the query image and the training images in the data set;
The data set selected in this embodiment is CIFAR-10, which comprises ten categories, with 50,000 training images and 10,000 test images. CIFAR-10 contains real objects from the real world; not only is the noise large, but the scale and appearance of the objects also vary, which makes recognition considerably difficult. The background subtraction algorithm mainly uses the BackgroundSubtractorMOG2 algorithm from OpenCV in Python. By default the method models the background from the previous 120 frames and applies a probabilistic foreground segmentation algorithm, i.e., Bayesian inference, to decide whether an object belongs to the foreground. The algorithm adaptively assigns newly observed objects a higher weight than older observations, so it can adapt to changes in illumination. Morphological operations such as opening and closing are used to remove unwanted noise.
Step 2, add a second-order attention mechanism after the convolutional layers of the convolutional neural network, which improves the dependency between the features of each convolutional layer, to obtain the second-order attention convolutional neural network. By adding a second-order attention module after each convolutional layer in turn and comparing the experimental results, the convolutional layer best suited for the second-order attention mechanism is found;
Specifically, the convolutional neural network comprises 2 × 3 pooling layers, 2 × 2 fully connected layers, and 3 × 1 convolutional layers containing 32, 32, and 64 filters respectively, and the filters in the convolutional layers are of size 5 × 5. The query image and the training images are convolved to obtain the corresponding feature maps, and a loss function is used to continuously update the weights in the network to achieve the best training effect;
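Since the layer counts in the text above are ambiguous, the following is only a sketch of one plausible reading of this backbone for CIFAR-10 inputs: three 5 × 5 convolutional layers with 32, 32, and 64 filters, each followed by pooling, plus two fully connected layers. All dimensions here are illustrative assumptions:

```python
import torch.nn as nn

# Hypothetical backbone; the exact layer arrangement in the source is unclear.
backbone = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=5, padding=2), nn.ReLU(),
    nn.MaxPool2d(2),                        # 32x32 -> 16x16 for CIFAR-10
    nn.Conv2d(32, 32, kernel_size=5, padding=2), nn.ReLU(),
    nn.MaxPool2d(2),                        # 16x16 -> 8x8
    nn.Conv2d(32, 64, kernel_size=5, padding=2), nn.ReLU(),
    nn.MaxPool2d(2),                        # 8x8 -> 4x4
    nn.Flatten(),
    nn.Linear(64 * 4 * 4, 128), nn.ReLU(),  # fully connected layer 1
    nn.Linear(128, 10),                     # fully connected layer 2 (10 classes)
)
```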
the specific process of processing the output of the convolutional layer to obtain the input of the next layer is as follows:
step a, representing the C-dimensional characteristic diagram with the size of H multiplied by W as a characteristic diagram F ═ F1,…,fc]The size is H multiplied by W multiplied by C; mapping the features to fcReshaped to a feature matrix X with dimension C and S ═ W × H, then the covariance matrix is calculated by:
in the above formula, the first and second carbon atoms are,i is an sxs matrix, and 1 is an sxs unit matrix;
step b, as the covariance matrix sigma is symmetrical and semi-positive, the covariance matrix sigma has eigenvalue decomposition (EIG); and carrying out covariance normalization on the covariance matrix sigma to obtain:
∑=U∧UT (2);
in the above formula, U is an orthogonal matrix, Λ ═ diag (λ)1,…,λC) Is a diagonal matrix with eigenvalues;
step c, carrying out convolution normalization on the covariance matrix sigma processed in the step b, and converting the covariance matrix sigma into a power of the characteristic value:
in the above formula, alpha is positive real number and lambadaα=diag(λα 1,…,λα C) (ii) a When α is 1, no normalization is performed; when alpha is<1, it shrinks the characteristic value larger than 1.0 non-linearly, and records the characteristic value smaller than 1.0; according to the data, alpha is 1/2 which has the best effect; in the present embodiment, α — 1/2 is set;
and d, taking the normalized covariance matrix as a channel descriptor through global covariance pooling. In particular, makeBy shrinkingObtaining a channel statistic value z, wherein z belongs to RC×1Then the statistical value z of the channel ccThe calculation method is as follows:
in the above formula, HGCP(. -) represents a global covariance pool function;
step e, applying gate control mechanism to the statistic value z of the channel ccConverting to obtain the scaling factor w in the channel cc:
wc=f(wUδ(WDzc)) (5);
In the above formula, WD、WUSetting the channel dimension of the features as C/r and C respectively for the weight of the convolution layer; f (-) and delta (-) denote sigmoid and RELU functions;
using a scaling factor w in channel ccMapping f to featurescAdjusting to obtain a characteristic diagramI.e. the input to the next layer:
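A minimal PyTorch sketch of steps a to f, assuming a batched eigendecomposition via torch.linalg.eigh and a reduction ratio r = 4 (r is not specified in the text); this illustrates the computation rather than reproducing the patent's exact implementation:

```python
import torch
import torch.nn as nn

class SecondOrderAttention(nn.Module):
    """Second-order (covariance) channel attention, following steps a to f."""
    def __init__(self, channels, r=4, alpha=0.5):
        super().__init__()
        self.alpha = alpha                                # step c: alpha = 1/2
        self.W_D = nn.Conv2d(channels, channels // r, 1)  # step e: reduce to C/r
        self.W_U = nn.Conv2d(channels // r, channels, 1)  # step e: restore to C

    def forward(self, F):                    # F: (B, C, H, W)
        B, C, H, W = F.shape
        S = H * W
        X = F.reshape(B, C, S)               # step a: C x S feature matrix
        eye = torch.eye(S, device=F.device)
        ones = torch.ones(S, S, device=F.device)
        I_bar = (eye - ones / S) / S
        Sigma = X @ I_bar @ X.transpose(1, 2)             # eq. (1): covariance
        lam, U = torch.linalg.eigh(Sigma)                 # eq. (2): EIG
        lam = lam.clamp(min=1e-10).pow(self.alpha)        # eq. (3): eigenvalue power
        Sigma_hat = U @ torch.diag_embed(lam) @ U.transpose(1, 2)
        z = Sigma_hat.mean(dim=2).reshape(B, C, 1, 1)     # eq. (4): GCP statistic
        w = torch.sigmoid(self.W_U(torch.relu(self.W_D(z))))  # eq. (5): gating
        return F * w                                      # eq. (6): rescale maps
```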
Step 3, input the query image and the training images processed in step 1 into the second-order attention convolutional neural network respectively for feature extraction, obtaining the query image features and the training image features;
Step 4, after global average pooling and L2 normalization of the query image features and the training image features, process them with a softmax function to obtain the query image descriptor D2 and the training image descriptors Ds1, where Ds1 denotes the descriptor of the s-th training image, s = 1, …, n;
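A minimal sketch of the descriptor construction, assuming PyTorch tensors; the softmax stage mentioned above belongs to the training-time classification and is omitted here, since retrieval only needs the pooled, normalized descriptors:

```python
import torch.nn.functional as F

def to_descriptor(feature_map):
    """Global average pooling + L2 normalization (step 4).
    feature_map: (B, C, H, W) -> descriptors: (B, C) with unit L2 norm."""
    d = feature_map.mean(dim=(2, 3))    # global average pooling over H, W
    return F.normalize(d, p=2, dim=1)   # L2 normalization
```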
Step 5, measure the similarity between the query image descriptor and each training image descriptor, and rank the training image descriptors according to similarity to obtain a ranking result;
Specifically, the similarity is measured by computing the Euclidean distance $d_s(x_s, y)$ between the query image descriptor D2 and each training image descriptor Ds1, where D2 = (y_1, …, y_n) and Ds1 = (x_{s1}, …, x_{sn}):

$$d_s(x_s, y) = \sqrt{\sum_{i=1}^{n} (x_{si} - y_i)^2}$$

The training images are ranked by Euclidean distance $d_s(x_s, y)$ from low to high (similarity from high to low): the smaller the Euclidean distance, the greater the similarity, i.e., training images ranked nearer the top are more similar to the query image. This yields the ranking result.
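A short NumPy sketch of this ranking step; the array shapes are assumptions:

```python
import numpy as np

def rank_by_distance(query_desc, train_descs):
    """Rank training images by Euclidean distance to the query (step 5).
    query_desc: (n,); train_descs: (m, n). Returns indices, most similar first."""
    d = np.sqrt(((train_descs - query_desc) ** 2).sum(axis=1))
    return np.argsort(d)  # smaller distance = greater similarity
```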
Step 6, rearrange the ranking result and retrieve the training image most similar to the query image.
Specifically, several top-ranked training images are selected from the ranking result, the average of their feature vectors is calculated, the result is rearranged according to this average vector, and the training image most similar to the query image is retrieved.
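This re-ranking amounts to a form of average query expansion. A minimal sketch, where k = 5 is an assumed value (the text only says "several top-ranked images"):

```python
import numpy as np

def rerank(query_desc, train_descs, k=5):
    """Re-rank by the mean vector of the top-k initial results (step 6)."""
    order = np.argsort(np.sqrt(((train_descs - query_desc) ** 2).sum(axis=1)))
    mean_vec = train_descs[order[:k]].mean(axis=0)   # average feature vector
    d = np.sqrt(((train_descs - mean_vec) ** 2).sum(axis=1))
    return np.argsort(d)                             # rearranged ranking
```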
In this way, the twin network image retrieval method based on a second-order attention mechanism adds a second-order attention mechanism to the convolution process to strengthen second-order spatial information and reweight the feature maps, so that salient image locations are emphasized before being used for description, improving both the local and global performance of the image descriptors. The method improves retrieval precision, saves retrieval time, and achieves fast, efficient, and accurate retrieval.
Claims (5)
1. A twin network image retrieval method based on a second-order attention mechanism is characterized by comprising the following steps:
step 1, performing background subtraction processing on a query image and a training image;
step 2, adding a second-order attention mechanism after the convolutional layer of the convolutional neural network to obtain a second-order attention convolutional neural network, wherein the second-order attention mechanism processes the output of the convolutional layer to obtain the input of the next layer;
step 3, inputting the query image and the training image processed in step 1 into the second-order attention convolutional neural network respectively for feature extraction to obtain query image features and training image features;
step 4, carrying out global average pooling and L2 normalization on the query image features and the training image features to obtain a query image descriptor D2 and training image descriptors Ds1, where Ds1 denotes the descriptor of the s-th training image, s = 1, …, n;
step 5, measuring the similarity between the query image descriptor and the training image descriptors, and ranking the training image descriptors according to the similarity to obtain a ranking result;
and step 6, rearranging the ranking result and retrieving the training image most similar to the query image.
2. The twin network image retrieval method based on the second-order attention mechanism according to claim 1, wherein the convolutional neural network in step 2 comprises 2 × 3 pooling layers, 2 × 2 fully connected layers, and 3 × 1 convolutional layers, and the filters in the convolutional layers are of size 5 × 5.
3. The twin network image retrieval method based on the second-order attention mechanism according to claim 1, wherein the specific process of processing the output of the convolutional layer in step 2 to obtain the input of the next layer is as follows:

step a, denoting the C feature maps of size H × W as F = [f_1, …, f_C], of overall size H × W × C, and reshaping the feature maps f_c into a feature matrix X with C rows and S = W × H columns; the covariance matrix is then calculated by:

$$\Sigma = X \bar{I} X^{\mathsf{T}} \quad (1)$$

where $\bar{I} = \frac{1}{S}\bigl(I - \frac{1}{S}\mathbf{1}\bigr)$, I is the S × S identity matrix, and $\mathbf{1}$ is the S × S all-ones matrix;

step b, performing covariance normalization on the covariance matrix Σ by eigenvalue decomposition:

$$\Sigma = U \Lambda U^{\mathsf{T}} \quad (2)$$

where U is an orthogonal matrix and $\Lambda = \operatorname{diag}(\lambda_1, \ldots, \lambda_C)$ is the diagonal matrix of eigenvalues;

step c, normalizing the covariance matrix Σ processed in step b by converting it into a power of its eigenvalues:

$$\hat{\Sigma} = \Sigma^{\alpha} = U \Lambda^{\alpha} U^{\mathsf{T}} \quad (3)$$

where α is a positive real number and $\Lambda^{\alpha} = \operatorname{diag}(\lambda_1^{\alpha}, \ldots, \lambda_C^{\alpha})$;

step d, letting $\hat{\Sigma} = [y_1, \ldots, y_C]$ and shrinking $\hat{\Sigma}$ to obtain the channel statistic z, with $z \in \mathbb{R}^{C \times 1}$; the c-th dimension of the channel statistic z is then calculated as:

$$z_c = H_{\mathrm{GCP}}(y_c) = \frac{1}{C} \sum_{i=1}^{C} y_c(i) \quad (4)$$

where $H_{\mathrm{GCP}}(\cdot)$ denotes the global covariance pooling function;

step e, applying a gating mechanism to the statistic $z_c$ of channel c to obtain the scaling factor $w_c$ of channel c:

$$w_c = f\bigl(W_U\, \delta(W_D\, z_c)\bigr) \quad (5)$$

where $W_D$ and $W_U$ are the weights of convolutional layers, and f(·) and δ(·) denote the sigmoid and ReLU functions;

step f, using the scaling factor $w_c$ of channel c to adjust the feature map $f_c$, obtaining the adjusted feature map $\hat{f}_c$ as the input of the next layer:

$$\hat{f}_c = w_c \cdot f_c \quad (6)$$
4. The twin network image retrieval method based on the second-order attention mechanism according to claim 1, wherein the specific process of step 5 is as follows: computing the Euclidean distance $d_s(x_s, y)$ between the query image descriptor D2 and each training image descriptor Ds1, where D2 = (y_1, …, y_n) and Ds1 = (x_{s1}, …, x_{sn}):

$$d_s(x_s, y) = \sqrt{\sum_{i=1}^{n} (x_{si} - y_i)^2}$$

and ranking the training images by the Euclidean distance $d_s(x_s, y)$ from low to high to obtain the ranking result.
5. The twin network image retrieval method based on the second-order attention mechanism according to claim 1, wherein the specific process of step 6 is as follows: selecting several top-ranked training images from the ranking result, calculating the average of their feature vectors, rearranging the result according to this average vector, and retrieving the training image most similar to the query image.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110410902.2A CN113190706A (en) | 2021-04-16 | 2021-04-16 | Twin network image retrieval method based on second-order attention mechanism |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113190706A true CN113190706A (en) | 2021-07-30 |
Family
ID=76977188
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110410902.2A Pending CN113190706A (en) | 2021-04-16 | 2021-04-16 | Twin network image retrieval method based on second-order attention mechanism |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113190706A (en) |
- 2021-04-16 CN CN202110410902.2A patent/CN113190706A/en active Pending
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120070075A1 (en) * | 2010-09-17 | 2012-03-22 | Honeywell International Inc. | Image processing based on visual attention and reduced search based generated regions of interest |
CN111198964A (en) * | 2020-01-10 | 2020-05-26 | 中国科学院自动化研究所 | Image retrieval method and system |
CN111354017A (en) * | 2020-03-04 | 2020-06-30 | 江南大学 | Target tracking method based on twin neural network and parallel attention module |
Non-Patent Citations (2)

Title |
---|
Tao Dai et al., "Second-order Attention Network for Single Image Super-Resolution", 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) * |
Wu Yuwei, "深度学习基础与应用" (Fundamentals and Applications of Deep Learning), Beijing Institute of Technology Press, 30 April 2020 * |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113920587A (en) * | 2021-11-01 | 2022-01-11 | 哈尔滨理工大学 | Human body posture estimation method based on convolutional neural network |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Wang et al. | Annotating images by mining image search results | |
Wang et al. | Annosearch: Image auto-annotation by search | |
CN106033426B (en) | Image retrieval method based on latent semantic minimum hash | |
Wang et al. | Retrieval-based face annotation by weak label regularized local coordinate coding | |
Sun et al. | Scene image classification method based on Alex-Net model | |
Mishra et al. | Image mining in the context of content based image retrieval: a perspective | |
CN111723692B (en) | Near-repetitive video detection method based on label features of convolutional neural network semantic classification | |
CN114461839A (en) | Multi-mode pre-training-based similar picture retrieval method and device and electronic equipment | |
CN110765285A (en) | Multimedia information content control method and system based on visual characteristics | |
Nezamabadi-pour et al. | Concept learning by fuzzy k-NN classification and relevance feedback for efficient image retrieval | |
Yao | Key frame extraction method of music and dance video based on multicore learning feature fusion | |
Kumar et al. | Content based video retrieval using deep learning feature extraction by modified VGG_16 | |
CN113190706A (en) | Twin network image retrieval method based on second-order attention mechanism | |
Liu et al. | Bit reduction for locality-sensitive hashing | |
Min et al. | Overview of content-based image retrieval with high-level semantics | |
Zhang et al. | Improved image retrieval algorithm of GoogLeNet neural network | |
Barz et al. | Content-based image retrieval and the semantic gap in the deep learning era | |
Morsillo et al. | Mining the web for visual concepts | |
Wu et al. | Deep Hybrid Neural Network With Attention Mechanism for Video Hash Retrieval Method | |
Ouni | A machine learning approach for image retrieval tasks | |
Zare Chahooki et al. | Bridging the semantic gap for automatic image annotation by learning the manifold space. | |
Ding et al. | A bag-of-feature model for video semantic annotation | |
Huo et al. | Fused feature encoding in convolutional neural network | |
Maihami et al. | Color Features and Color Spaces Applications to the Automatic Image Annotation | |
He et al. | Construction of user preference profile in a personalized image retrieval |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20210730 |
RJ01 | Rejection of invention patent application after publication |