CN106649665A - Object-level depth feature aggregation method for image retrieval - Google Patents


Info

Publication number
CN106649665A
CN106649665A (application CN201611152148.2A)
Authority
CN
China
Prior art keywords
image
candidate region
feature
convolutional neural networks
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201611152148.2A
Other languages
Chinese (zh)
Inventor
李豪杰
暴雨
樊鑫
罗钟铉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dalian University of Technology
Original Assignee
Dalian University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dalian University of Technology filed Critical Dalian University of Technology
Priority to CN201611152148.2A priority Critical patent/CN106649665A/en
Publication of CN106649665A publication Critical patent/CN106649665A/en
Pending legal-status Critical Current

Links

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 - Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/23 - Clustering techniques
    • G06F18/232 - Non-hierarchical techniques
    • G06F18/2321 - Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 - Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Library & Information Science (AREA)
  • Databases & Information Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the field of digital media and provides an object-level deep feature aggregation method for image retrieval. First, an unsupervised method is used to generate candidate regions that may contain objects; then the corresponding convolutional neural network features are extracted; finally, the region features are aggregated to obtain an image feature representation that is highly robust to image transformations, for use in image retrieval applications. The invention addresses the lack of geometric-transformation and spatial-layout invariance in existing models by adopting an object-based approach. The image features generated by the method are highly robust to geometric transformations and changes of spatial arrangement, so the accuracy of image retrieval is increased; moreover, the obtained image feature is compact and concise, which reduces the complexity of similarity computation between images and increases retrieval efficiency.

Description

An object-level deep feature aggregation method for image retrieval
Technical field
The invention belongs to the field of digital media and relates to an object-level deep feature aggregation method for image retrieval.
Background technology
Content-based image retrieval (CBIR), an important research problem in computer vision, has received wide attention from scholars at home and abroad over the past decade. CBIR refers to finding, in an image database, the images similar to a query image. Because of differences in shooting angle, distance, environment and other factors, the same or similar objects can vary greatly across images, e.g. in scale, viewpoint and layout. Generating an image feature that is highly robust to such variations is therefore the key to solving the image retrieval problem.
Compared with traditional hand-crafted image features, learning-based methods, especially convolutional neural networks, have shown a powerful capability in image feature extraction and have achieved great success in computer vision tasks such as image classification and object detection. For the image retrieval problem, two kinds of convolutional neural network feature representations currently exist: global and local.
Global methods directly extract the feature of the entire image with a convolutional neural network and use it as the final image feature. However, because the convolutional neural network mainly encodes global spatial information, the resulting feature lacks invariance to geometric transformations such as scale, rotation and translation and to changes of spatial layout, which limits its robustness when retrieving highly variable images.
Local methods extract features of local image regions with a convolutional neural network and then aggregate these region features into the final image feature. Although such methods take the local information of the image into account, so that the feature is more robust to various changes than in global methods, they still have some defects. For example, methods that obtain image regions with sliding windows (see Yunchao Gong, Liwei Wang, Ruiqi Guo, Svetlana Lazebnik, "Multi-scale orderless pooling of deep convolutional activation features", European Conference on Computer Vision, 2014, pp. 392-407) do not consider visual content such as color, texture and edges, and therefore produce a large number of regions without semantic meaning, which introduces redundancy and noise into the subsequent aggregation. In addition, region features are commonly merged with max pooling (see Konda Reddy Mopuri, R. Venkatesh Babu, "Object level deep feature pooling for compact image representation", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2015, pp. 62-70); because only the maximum response is kept and the associations between features are ignored, a large amount of information is lost, which reduces the discriminability of the final image feature.
The present invention solves the above problems with an object-based approach. When generating image regions, a content-based unsupervised object proposal method is used, i.e. image regions are generated by clustering visual information such as color, texture and edges. Since parts of the same semantic object share a certain visual similarity within an image, the regions thus obtained are very likely to contain an object or part of an object. Meanwhile, a scene image is typically composed of several objects, and parsing these objects is the key to understanding the scene. The content-based image regions therefore contain more semantically meaningful visual information than simple sliding windows, and their feature descriptions are more discriminative; at the same time, fusing object-level features makes the final feature robust to changes of the spatial layout of objects in the scene. In the feature aggregation stage, the VLAD (Vector of Locally Aggregated Descriptors) algorithm is used: the region features are first clustered, and then, for each image, the accumulated residuals of all region features with respect to their nearby cluster centers represent the final image feature. Compared with max pooling, this method considers the associations between region features and describes the local information of the image more finely, so the final image feature is more robust to various image transformations.
The content of the invention
In view of the deficiencies of the prior art, the present invention provides an object-level deep feature aggregation method for image retrieval, which generates image features that are highly robust to geometric transformations and to changes of object spatial layout, for use in image retrieval applications.
The technical scheme of the invention is as follows:
An object-level deep feature aggregation method for image retrieval, comprising the following steps:
Step 1: extract candidate regions from each image in the database with the Selective Search algorithm, generating image candidate regions that are likely to contain objects. Selective Search (Selective Search for Object Recognition) is an image segmentation algorithm that merges regions hierarchically using visual information; it can generate class-independent, high-quality, multi-scale candidate regions. Compared with sliding windows, the feature descriptions of candidate regions containing objects are more discriminative, and the object-based scheme also improves the robustness of the fused feature to spatial layout changes.
Step 2: select a widely adopted convolutional neural network architecture and pre-train the network on a public database.
Step 3: extract the features of all image candidate regions with the trained convolutional neural network:
3.1) scale and pad each image candidate region to a fixed size, and use it as the input of the convolutional neural network;
3.2) use the output of the fully connected layer FC7 of the convolutional neural network as the descriptive feature of the image candidate region.
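The scale-and-pad preprocessing of step 3.1 can be sketched as follows. The patent does not specify the interpolation or padding scheme, so nearest-neighbor resizing, zero padding and centering are assumptions here:

```python
import numpy as np

def resize_nearest(img, out_h, out_w):
    """Nearest-neighbor resize of an H x W x C image (pure NumPy)."""
    h, w = img.shape[:2]
    rows = (np.arange(out_h) * h // out_h).clip(0, h - 1)
    cols = (np.arange(out_w) * w // out_w).clip(0, w - 1)
    return img[rows][:, cols]

def to_fixed_size(region, size=224):
    """Scale the longer side to `size`, then zero-pad to size x size."""
    h, w = region.shape[:2]
    scale = size / max(h, w)
    nh, nw = max(1, round(h * scale)), max(1, round(w * scale))
    resized = resize_nearest(region, nh, nw)
    canvas = np.zeros((size, size, region.shape[2]), dtype=region.dtype)
    top, left = (size - nh) // 2, (size - nw) // 2
    canvas[top:top + nh, left:left + nw] = resized
    return canvas

region = np.random.rand(120, 60, 3)   # a tall candidate region crop
inp = to_fixed_size(region)
print(inp.shape)  # (224, 224, 3)
```

The 224*224 target matches the network input size named later in the embodiment; any real pipeline would likely use a library resizer (e.g. bilinear) instead of this minimal nearest-neighbor version.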
Step 4: reduce the dimensionality of the candidate region features obtained in step 3 to N dimensions with principal component analysis, obtaining low-dimensional candidate region features. The dimensionality reduction lowers the complexity of subsequent computation and improves efficiency.
Step 5: apply unsupervised K-means clustering to the low-dimensional candidate region features obtained in step 4, producing K cluster centers.
Step 6: aggregate the low-dimensional candidate region features belonging to each image (from step 4) with the K cluster centers (from step 5) using the VLAD algorithm; each image yields an N*K-dimensional VLAD feature. VLAD (Vector of Locally Aggregated Descriptors) is a statistics-based fusion method: it represents the final image feature by the accumulated residuals of the region features with respect to their nearby cluster centers. Compared with simple pooling, it describes the image content in more detail, and the generated feature is more robust to image transformations.
Step 7: reduce the dimensionality of the VLAD features obtained in step 6 to D dimensions with principal component analysis, generating a compact image feature. The dimensionality reduction lowers the complexity and noise of similarity computation, where the similarity between images is measured by the Euclidean distance between their image features.
The beneficial effect of the present invention is that the generated image feature is highly robust to geometric transformations and spatial layout changes, which greatly improves the accuracy of image retrieval; moreover, the obtained image feature is very compact, which reduces the complexity of similarity computation between images.
Description of the drawings
Fig. 1 is the flow chart of the deep feature aggregation method of the present invention.
Fig. 2 is a schematic diagram of image retrieval results: the leftmost image is the query image, and the remaining images are the retrieved similar images, sorted from left to right in descending order of similarity.
Specific embodiment
A specific embodiment of the present invention is described in detail below in combination with the technical scheme and the accompanying drawings.
Embodiment 1: retrieval of similar images
1. Fig. 1 is the flow chart of the invention. First, candidate regions are extracted from all database images with the fast mode of the Selective Search algorithm; on average, about 2000 candidate regions of different sizes are obtained per image.
2. The invention uses the AlexNet convolutional neural network architecture of Krizhevsky et al., whose input is a 224*224 RGB image and which comprises five convolutional layers, three max-pooling layers and three fully connected layers. The network is trained with the Caffe framework; the training data is the 1000-class classification dataset of the ILSVRC12 competition.
3. After network training is complete, each candidate region obtained in step 1 is padded and scaled to the fixed size 224*224 and used as the network input; the output of the fully connected layer fc7 is extracted as the feature of the corresponding candidate region, with a size of 4096 dimensions.
4. The features of all candidate regions are reduced with principal component analysis, where the corresponding projection dictionary has size 512*4096; the feature dimensionality of all candidate regions is reduced from 4096 to 512, yielding the low-dimensional candidate region features.
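The PCA reduction of this step can be sketched with a plain SVD; the "dictionary" corresponds to the learned projection matrix. Dimensions are scaled down here for illustration (the patent reduces 4096-dimensional fc7 features to 512):

```python
import numpy as np

def pca_fit(X, n_components):
    """Learn a PCA projection ("dictionary") from row-vector features X."""
    mean = X.mean(axis=0)
    # SVD of the centered data; rows of Vt are the principal directions
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]        # components: (n_components, dim)

def pca_transform(X, mean, components):
    """Project centered features onto the principal directions."""
    return (X - mean) @ components.T

rng = np.random.default_rng(0)
feats = rng.normal(size=(500, 64))        # stand-in for 4096-D fc7 features
mean, comps = pca_fit(feats, 8)           # patent: 4096 -> 512
low = pca_transform(feats, mean, comps)
print(low.shape)  # (500, 8)
```

The same fitted `mean` and `comps` are reused for query features later (step 8), which is why the embodiment speaks of a trained PCA dictionary.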
5. Unsupervised K-means clustering is applied to the low-dimensional candidate region features, producing 256 cluster centers {c1, c2, ..., c256}.
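The clustering of this step can be sketched with plain Lloyd's K-means; K and the data sizes are scaled down here for illustration (the patent uses K = 256 on 512-dimensional features):

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Plain Lloyd's k-means on row vectors X; returns (centers, labels)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # assign each point to its nearest center (squared Euclidean distance)
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        # move each center to the mean of its assigned points
        for j in range(k):
            pts = X[labels == j]
            if len(pts):
                centers[j] = pts.mean(0)
    return centers, labels

rng = np.random.default_rng(1)
feats = rng.normal(size=(300, 16))        # pooled low-dimensional region features
centers, labels = kmeans(feats, 8)        # patent uses K = 256
print(centers.shape)  # (8, 16)
```

In practice a library implementation with multiple restarts would be preferable; this sketch only shows the mechanics.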
6. The low-dimensional candidate region features of each image are encoded into a VLAD feature with the VLAD algorithm. First, each low-dimensional candidate region feature p_j in the image is assigned to its 5 nearest cluster centers rNN(p_j); then the weighted residuals of all low-dimensional candidate region features with respect to their assigned cluster centers are accumulated, obtaining the VLAD feature x of the image:

x = [ Σ_{j: c_1 ∈ rNN(p_j)} w_j1 (p_j - c_1), ..., Σ_{j: c_K ∈ rNN(p_j)} w_jK (p_j - c_K) ]

where j is the index of a candidate region within an image; p_j is the low-dimensional feature of candidate region j; c_1 and c_k are the first and the k-th cluster centers; rNN(p_j) is the set of the 5 cluster centers nearest to p_j; and w_j1, w_jk are the Gaussian kernel similarities between p_j and c_1, c_k respectively, representing the weights of the corresponding cluster centers. For each candidate region, the weights over its 5 nearest cluster centers are normalized to sum to 1. Finally, each image obtains a corresponding VLAD feature of size 512*256 = 131072 dimensions.
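The soft-assignment VLAD encoding described above can be sketched as follows. The Gaussian kernel bandwidth `sigma` is an assumption (the patent does not specify it), and the dimensions are scaled down for illustration:

```python
import numpy as np

def soft_vlad(feats, centers, r=5, sigma=1.0):
    """VLAD with soft assignment to the r nearest cluster centers.

    Each feature contributes its residual to its r nearest centers,
    weighted by a Gaussian kernel similarity normalized to sum to 1
    over those r centers, then the per-center sums are concatenated.
    """
    K, dim = centers.shape
    x = np.zeros((K, dim))
    for p in feats:
        d2 = ((centers - p) ** 2).sum(1)          # squared distances to all centers
        near = np.argsort(d2)[:r]                 # rNN(p): the r nearest centers
        w = np.exp(-d2[near] / (2 * sigma ** 2))  # Gaussian kernel similarities
        w /= w.sum()                              # normalize weights to sum to 1
        for wk, k in zip(w, near):
            x[k] += wk * (p - centers[k])         # accumulate weighted residual
    return x.ravel()                              # K * dim dimensional VLAD

rng = np.random.default_rng(2)
feats = rng.normal(size=(40, 16))                 # one image's region features
centers = rng.normal(size=(8, 16))                # patent: 256 centers, 512-D
v = soft_vlad(feats, centers)
print(v.shape)  # (128,)
```

With the patent's sizes (512-dimensional features, 256 centers) the output would be 512*256 = 131072-dimensional, matching the text.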
7. The VLAD features obtained in step 6 are reduced with principal component analysis, where the corresponding projection dictionary has size 512*131072; the feature dimensionality is reduced from 131072 to 512, yielding the compact image feature.
8. For a query image, candidate regions are generated as in step 1 and their features are extracted as in step 3; then the PCA dictionary and cluster centers trained in steps 4 and 5 are applied, and the corresponding VLAD feature is obtained as in step 6; finally, dimensionality reduction with the PCA dictionary trained in step 7 yields the compact 512-dimensional image feature.
9. The Euclidean distances between the query image feature and the database image features are computed and sorted: a smaller distance indicates a higher similarity between the images. Fig. 2 shows a schematic diagram of the retrieval results.
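The final ranking step can be sketched directly with NumPy: database images are sorted by the Euclidean distance of their features to the query feature, smallest first:

```python
import numpy as np

def rank_by_distance(query, gallery):
    """Return gallery indices sorted by Euclidean distance to the query
    (smaller distance = more similar), plus the distances themselves."""
    d = np.linalg.norm(gallery - query, axis=1)
    return np.argsort(d), d

# toy 2-D features standing in for 512-D compact image features
gallery = np.array([[0.0, 0.0], [1.0, 1.0], [0.1, 0.0]])
order, dists = rank_by_distance(np.array([0.0, 0.0]), gallery)
print(order.tolist())  # [0, 2, 1]
```

For a large database, the same ranking would typically be served by an approximate nearest-neighbor index rather than a brute-force scan.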

Claims (1)

1. An object-level deep feature aggregation method for image retrieval, characterized by the following steps:
Step 1: extract candidate regions from each image in the database with the Selective Search algorithm, generating image candidate regions;
Step 2: select a convolutional neural network architecture and pre-train the network on a public database;
Step 3: extract the features of all image candidate regions with the trained convolutional neural network:
3.1) scale and pad each image candidate region to a fixed size, and use it as the input of the convolutional neural network;
3.2) use the output of the fully connected layer FC7 of the convolutional neural network as the descriptive feature of the image candidate region;
Step 4: reduce the dimensionality of the candidate region features obtained in step 3 to N dimensions with principal component analysis, obtaining low-dimensional candidate region features;
Step 5: apply unsupervised K-means clustering to the low-dimensional candidate region features obtained in step 4, producing K cluster centers;
Step 6: aggregate the low-dimensional candidate region features belonging to each image (from step 4) with the K cluster centers (from step 5) using the VLAD algorithm; each image yields an N*K-dimensional VLAD feature;
Step 7: reduce the dimensionality of the VLAD features obtained in step 6 to D dimensions with principal component analysis, generating a compact image feature.
CN201611152148.2A 2016-12-14 2016-12-14 Object-level depth feature aggregation method for image retrieval Pending CN106649665A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611152148.2A CN106649665A (en) 2016-12-14 2016-12-14 Object-level depth feature aggregation method for image retrieval

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611152148.2A CN106649665A (en) 2016-12-14 2016-12-14 Object-level depth feature aggregation method for image retrieval

Publications (1)

Publication Number Publication Date
CN106649665A (en) 2017-05-10

Family

ID=58822514

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611152148.2A Pending CN106649665A (en) 2016-12-14 2016-12-14 Object-level depth feature aggregation method for image retrieval

Country Status (1)

Country Link
CN (1) CN106649665A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107239535A (en) * 2017-05-31 2017-10-10 北京小米移动软件有限公司 Similar pictures search method and device
CN108205580A (en) * 2017-09-27 2018-06-26 深圳市商汤科技有限公司 A kind of image search method, device and computer readable storage medium
CN108416290A (en) * 2018-03-06 2018-08-17 中国船舶重工集团公司第七二四研究所 Radar signal feature method based on residual error deep learning
CN108596163A (en) * 2018-07-10 2018-09-28 中国矿业大学(北京) A kind of Coal-rock identification method based on CNN and VLAD
CN108874889A (en) * 2018-05-15 2018-11-23 中国科学院自动化研究所 Objective body search method, system and device based on objective body image
CN109948666A (en) * 2019-03-01 2019-06-28 广州杰赛科技股份有限公司 Image similarity recognition methods, device, equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060193538A1 (en) * 2005-02-28 2006-08-31 Microsoft Corporation Graphical user interface system and process for navigating a set of images
CN102112984A (en) * 2008-07-29 2011-06-29 皇家飞利浦电子股份有限公司 Method and apparatus for generating image collection
CN105930382A (en) * 2016-04-14 2016-09-07 严进龙 Method for searching for 3D model with 2D pictures

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060193538A1 (en) * 2005-02-28 2006-08-31 Microsoft Corporation Graphical user interface system and process for navigating a set of images
CN102112984A (en) * 2008-07-29 2011-06-29 皇家飞利浦电子股份有限公司 Method and apparatus for generating image collection
CN105930382A (en) * 2016-04-14 2016-09-07 严进龙 Method for searching for 3D model with 2D pictures

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
HERVÉ JÉGOU ET AL.: "Aggregating local descriptors into a compact image representation", 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition *
KONDA REDDY MOPURI ET AL.: "Object Level Deep Feature Pooling for Compact Image Representation", 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops *
YUNCHAO GONG ET AL.: "Multi-Scale Orderless Pooling of Deep Convolutional Activation Features", European Conference on Computer Vision 2014 *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107239535A (en) * 2017-05-31 2017-10-10 北京小米移动软件有限公司 Similar pictures search method and device
CN108205580A (en) * 2017-09-27 2018-06-26 深圳市商汤科技有限公司 A kind of image search method, device and computer readable storage medium
WO2019062534A1 (en) * 2017-09-27 2019-04-04 深圳市商汤科技有限公司 Image retrieval method, apparatus, device and readable storage medium
KR20200011988A (en) * 2017-09-27 2020-02-04 선전 센스타임 테크놀로지 컴퍼니 리미티드 Image retrieval methods, devices, devices, and readable storage media
KR102363811B1 (en) * 2017-09-27 2022-02-16 선전 센스타임 테크놀로지 컴퍼니 리미티드 Image retrieval methods, devices, instruments and readable storage media
US11256737B2 (en) * 2017-09-27 2022-02-22 Shenzhen Sensetime Technology Co., Ltd. Image retrieval methods and apparatuses, devices, and readable storage media
CN108416290A (en) * 2018-03-06 2018-08-17 中国船舶重工集团公司第七二四研究所 Radar signal feature method based on residual error deep learning
CN108874889A (en) * 2018-05-15 2018-11-23 中国科学院自动化研究所 Objective body search method, system and device based on objective body image
CN108596163A (en) * 2018-07-10 2018-09-28 中国矿业大学(北京) A kind of Coal-rock identification method based on CNN and VLAD
CN109948666A (en) * 2019-03-01 2019-06-28 广州杰赛科技股份有限公司 Image similarity recognition methods, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
Wang et al. Enhancing sketch-based image retrieval by cnn semantic re-ranking
CN106649665A (en) Object-level depth feature aggregation method for image retrieval
Saavedra et al. Sketch based Image Retrieval using Learned KeyShapes (LKS).
CN108038122B (en) Trademark image retrieval method
CN108280187B (en) Hierarchical image retrieval method based on depth features of convolutional neural network
CN104036012B (en) Dictionary learning, vision bag of words feature extracting method and searching system
CN106126581A (en) Cartographical sketching image search method based on degree of depth study
Sarkhel et al. Deterministic routing between layout abstractions for multi-scale classification of visually rich documents
CN110175249A (en) A kind of search method and system of similar pictures
CN107908646A (en) A kind of image search method based on layering convolutional neural networks
CN109086405A (en) Remote sensing image retrieval method and system based on conspicuousness and convolutional neural networks
CN105760875B (en) The similar implementation method of differentiation binary picture feature based on random forests algorithm
Sasithradevi et al. Video classification and retrieval through spatio-temporal Radon features
CN112036511B (en) Image retrieval method based on attention mechanism graph convolution neural network
Singh et al. Image corpus representative summarization
CN113920516A (en) Calligraphy character skeleton matching method and system based on twin neural network
Lin et al. Scene recognition using multiple representation network
Qi et al. Object-based image retrieval with kernel on adjacency matrix and local combined features
Zhao et al. Content-based image retrieval using optimal feature combination and relevance feedback
Song et al. Srrm: Semantic region relation model for indoor scene recognition
Ding et al. An efficient 3D model retrieval method based on convolutional neural network
CN105975643B (en) A kind of realtime graphic search method based on text index
Farhangi et al. Informative visual words construction to improve bag of words image representation
Ahmad et al. SSH: Salient structures histogram for content based image retrieval
Xu et al. Sketch-based shape retrieval via multi-view attention and generalized similarity

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20170510