WO2015032670A1 - Method of classification of images and corresponding device - Google Patents

Method of classification of images and corresponding device

Info

Publication number
WO2015032670A1
Authority
WO
WIPO (PCT)
Prior art keywords
images
positive
attribute
image feature
visual
Application number
PCT/EP2014/068166
Other languages
French (fr)
Inventor
Praveen Anil KULKARNI
Gaurav Sharma
Joaquin Zepeda
Louis Chevallier
Original Assignee
Thomson Licensing
Application filed by Thomson Licensing
Publication of WO2015032670A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2457Query processing with adaptation to user needs
    • G06F16/24578Query processing with adaptation to user needs using ranking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/951Indexing; Web crawling techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting



Abstract

A user-specified textual query is fed to an image search engine. The returned images are used as training data to learn a classifier for the class category specified by the textual query. The method employs attribute classifiers to improve the on-the-fly image classification system.

Description

Method of classification of images and corresponding device
1. Field
The present disclosure relates to the field of classification of images.
2. Background
Image classification aims to determine whether a given image contains a specific visual concept (e.g. car, cow, sea, forest). To make the task suitable for a computer, images are represented by extracting from them a feature vector. The feature vector is a high-dimensional vector of numerical features that represent an image. A widely used feature vector is the bag-of-words vector, consisting of a histogram of quantized image patches centered at corners detected in the image. Throughout this document, if not explicitly mentioned, it is implicitly assumed that computations that use images in fact use the corresponding feature vectors. In image classification a training set is obtained by manual annotation of a set of images. For example, for the concept "cat", a person goes through, e.g., 1000 images of a set of images, and assigns to each image a label of "1" if a cat is present in the image and a "0" otherwise. The resulting set of images and labels is referred to as an "annotated training data set". This annotated training data set is used to learn a classifier, and this classifier can be used to assign a label to an un-annotated set of images without manual intervention. The classifier can be thought of as an algorithm which classifies a given image and assigns it to a corresponding visual concept. The assignment is based on the visual content present in the image. Although there are many classifiers, the most widely used, and also simplest to use, classifiers are support vector machines (SVM) [1] and K-nearest neighbors [2]. Given a set of positive training examples and negative training examples, an SVM learns the separating hyperplane between the positive and negative examples. An image to classify is then classified based on which side of the hyperplane it is situated. In K-nearest neighbors, classification is based on class membership: again given the set of training examples (positive examples and negative examples), an image is classified based on a majority vote of its K nearest neighbors among the training examples. For example, if K=10 and the image to classify is closest to 7 positive examples and 3 negative examples, then the image is assigned the visual concept corresponding to the positive examples. A traditional approach to image classification, according to a specified visual concept (e.g. specified by a user), of an un-annotated image data set (e.g. a user collection of images) consists of 3 steps:
(i) Collecting an annotated training set consisting of positive images (i.e. a "positive data set") representing a specified visual concept and negative images (i.e. a "negative data set") representing the universe of visual concepts that do not match the specified visual concept.
(ii) Using the collected annotated training set to train a visual classifier for the specified visual concept. The SVM (Support Vector Machine) training algorithm is the most widely used approach in this step. It produces a vector in the image feature space, referred to as the classifier vector.
(iii) Applying the resulting classifier vector to the un-annotated image data set, in order to classify the images in this data set. In the case of an SVM classifier vector, this amounts to computing an inner product between the classifier vector and each of the image feature vectors of the images in the set of un-annotated images. The inner product can be used to rank the images in the un-annotated data set. The images with the highest rank (with the largest value of the inner product) are more likely to belong to the class; a minimal sketch of this ranking step is given below.
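By way of illustration, the following is a minimal sketch of the ranking of step (iii) in Python with NumPy. It assumes a linear (SVM-style) classifier vector; the names rank_by_classifier, w and features are illustrative and not from the patent.

    import numpy as np

    def rank_by_classifier(w, features):
        # w: classifier vector of shape (d,)
        # features: feature vectors of the un-annotated set, shape (N, d)
        scores = features @ w          # one inner product per image
        return np.argsort(-scores)     # image indices, highest score first

    # toy example: 5 images with 3-dimensional feature vectors
    w = np.array([1.0, -0.5, 0.2])
    X = np.random.default_rng(0).normal(size=(5, 3))
    print(rank_by_classifier(w, X))

Images whose feature vectors have a large inner product with the classifier vector are ranked first, which is exactly the ordering used to decide class membership.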
Some of the disadvantages of this traditional approach are:
1. In order to be able to classify images in the un-annotated data set, training sets prepared offline are needed. Annotated training sets are collected for a specific visual concept. It would thus be interesting to have some information about the visual concepts present in the un-annotated data set, in order to be able to prepare offline annotated training sets of positive images that correspond to those visual concepts. In practice, this is difficult. For example, if the un-annotated data set is a collection of images from a movie, the task is almost impossible, as there may be thousands of visual concepts in these images.
2. The annotation of the training set is a manual task, and is laborious and expensive.
3. The resulting classifier vectors are fixed and thus inflexible for learning new visual concepts. Yet the un-annotated dataset will likely vary, e.g. when the un-annotated data set is a user collection of images, images are deleted, replaced, and added to the user collection.
To address these problems of the traditional approach to image classification, prior art, e.g. Chatfield et al., "VISOR: Towards On-The-Fly Large-Scale Object Category Retrieval", published on the Internet on November 16, 2012, proposes that a visual concept, specified as a text query, is to be searched for in an un-annotated dataset; the textual query is fed to the Google Image Search engine (GIS). The images resulting from the query then form a positive data set. A fixed pool of very diverse images is used as a negative data set (this corresponds to the previously discussed step (i)). With these positive and negative data sets, a classifier for the visual concept specified by the text query can then be learned (corresponding to the previously mentioned step (ii)), and the resulting classifier vector is used as in step (iii) for the classification of an un-annotated data set. However, two problems can be observed with this prior-art on-the-fly image classification:
(i) Due to the universality of the web, many images retrieved from the image search engine are not relevant to the queried visual concept; they are incorrect representatives of the queried visual concept. For example, a search for the visual concept "can" produces images of aluminium cans along with posters of Obama's political slogan "Yes, we can", and logos of an African organization called "Coupe d'Afrique de Nations". As these images are in the positive data set, which is used as a training set, the incorrect training data significantly biases the classifier training process.
(ii) The images retrieved from the image search engine that are relevant to the queried visual concept, as well as the images in the fixed negative image data set, are too heterogeneous. This means that the result of classification will not be as good as when using a manually annotated training set, as in the traditional classification system.
It would therefore be desirable to address some of the above drawbacks and to improve the prior-art on-the-fly image classification methods in order to get better classification results.
3. Summary
According to different embodiments, there is proposed a method and device for classification of image data that solves at least some of the problems of the prior art.
To this end, the disclosure comprises a method of classification of images, the method comprising: receiving a query for a visual concept; executing a first image search for the query visual concept in a first set of images and extracting first image feature descriptors from images returned by the first image search that correspond to said queried visual concept, forming a positive set of positive image feature descriptors of images; determining a set of attribute visual concepts that are related to the queried visual concept; executing a second image search in the first set of images for the set of attribute visual concepts and extracting second image feature descriptors from images returned by the second image search, forming an attribute set of attribute image feature descriptors of images; computing a first score for each of the positive image feature descriptors in the positive set by calculating inner products between each of the positive image feature descriptors in the positive set and the attribute image feature descriptors in the attribute set, and removing positive image feature descriptors from the positive set having a first score that is under a determined threshold, forming a pruned positive set; feeding the pruned positive set and a negative set with feature vectors of images not corresponding to the queried visual concept to a vector classifier, thereby defining a partition of the image feature space in a positive region and in a negative region; classifying images in an un-annotated set of images by computing an inner product between the vector classifier and each of the image feature descriptors of the un-annotated set thereby obtaining a second score for each image in the un-annotated set, and classifying of the images in the un-annotated set according to the second score.
According to a variant embodiment of the method, it further comprises a step of computing a third score for each of the images in the un-annotated data set by calculating inner products between image feature descriptors of each of the images in the un-annotated set and the image feature descriptors in the attribute set, and combining the third score with the second score obtained in the classifying images step to obtain a fourth score for each of the images in the un-annotated set, and classifying of the images in the un-annotated set according to the fourth score.
According to a variant embodiment of the method, the determining of said set of attribute visual concepts that are related to said queried visual concept comprises a step of retrieving a list of attributes that are related to the queried visual concept from a relational image set.
According to a variant embodiment of the method, the determining of said set of attribute visual concepts that are related to said queried visual concept comprises a step of retrieving a list of attributes that are related to the queried visual concept from information comprised in the query.
According to a variant embodiment of the method, the determining of said set of attribute visual concepts that are related to said queried visual concept comprises a step of retrieving a list of attributes that are related to the queried visual concept through language processing of the query.
According to a variant embodiment of the method, the determining of said set of attribute visual concepts that are related to said queried visual concept comprises a step of retrieving a list of attributes that are related to the queried visual concept through a word search in textual data, whereby words that have a high frequency of occurrence relative to other words are retained for inclusion in the list of attributes.
The present disclosure also comprises a device for classification of images, the device comprising: an interface for receiving a query for a visual concept; a processing module for executing a first image search for the query visual concept in a first set of images and extracting first image feature descriptors from images returned by the first image search that correspond to said queried visual concept, forming a positive set of positive image feature descriptors of images; a processing module for determining a set of attribute visual concepts that are related to the queried visual concept; a processing module for executing a second image search in the first set of images for the set of attribute visual concepts and for extracting second image feature descriptors from images returned by the second image search, forming an attribute set of attribute image feature descriptors of images; a processing module for computing a first score for each of the positive image feature descriptors in the positive set by calculating inner products between each of the positive image feature descriptors in the positive set and the attribute image feature descriptors in the attribute set, and for removing positive image feature descriptors from the positive set having a first score that is under a determined threshold, forming a pruned positive set; a processing module for feeding the pruned positive set and a negative set with feature vectors of images not corresponding to the queried visual concept to a vector classifier, thereby defining a partition of the image feature space in a positive region and in a negative region; a processing module for classifying images in an un-annotated set of images by computing an inner product between the vector classifier and each of the image feature descriptors of the un-annotated set thereby obtaining a second score for each image in the un-annotated set, and classifying of the images in the un-annotated set according to the second score.
The discussed advantages and other advantages not mentioned in this document will become clear upon the reading of the detailed description of the present disclosure.
4. Brief description of drawings
Other characteristics and advantages of the different described example embodiments will appear when reading the following description and the annexed drawings. The embodiments described hereafter are merely provided as examples and are not meant to be restrictive.
The embodiments will be described with reference to the following figures:
Figure 1 is a traditional prior-art classification system.
Figure 2 is an on-the-fly prior-art classification system that addresses some of the problems of the traditional prior-art classification system of figure 1.
Figure 3 is an attribute-based on-the-fly classification system according to a particular example embodiment.
Figure 4 is a variant embodiment of the on-the-fly classification system of figure 3.
Figure 5 is a flow chart of a particular embodiment of the disclosed on-the-fly classification system.
Figure 6 is a device implementing a particular embodiment of the disclosed on-the-fly classification system.
5. Description of embodiments
Figure 1 is a traditional prior-art classification system which has briefly been discussed in the first part of the background section.
Element 10 represents a positive data set X+ that is collected in a first step (i) of collecting an annotated training set consisting of positive images representing the visual concept. Element 11 is a negative data set X- of negative images representing the universe of opposite, non-matching concepts. The annotated data is then used to train a visual classifier for the specified visual concept, using an SVM training algorithm. This produces a vector, or classifier, W in the image feature space. The resulting classifier W is then applied to all un-annotated images of an un-annotated data set (element 14) (e.g. a 'raw' user collection of images that has not been classified), which amounts to computing the inner product between the classifier vector and the image feature vector of each un-annotated image. The computed inner product is then associated with the un-annotated image, thereby resulting in an annotated image. The computing of inner products is repeated for every un-annotated image. Finally the images are ranked according to their inner product (element 13). As mentioned, this prior-art method has the drawbacks discussed in the background part.
Figure 2 is a prior-art on-the-fly classification system (Chatfield) that has briefly been discussed in the second part of the background section and that addresses some problems of the traditional prior art of figure 1. The classification system of figure 2 is referred to as an "on-the-fly" classification system and is entirely machine processed, in contrast to the traditional classification system, which comprises a manual annotation step of the training set. This machine processing makes it possible to classify with a delay that is much shorter than is possible with the traditional classification system. Processing is as follows. A visual concept, specified as a text query (element 17), is to be searched for in an un-annotated data set (element 14), e.g. a user collection of unclassified images. The textual query is fed to an image search engine (element 18). The images returned by the image search engine form a positive data set (element 20). Image features (see the background section for information on feature vectors) are extracted by an image feature vector extractor (element 19) from the images returned by the image search engine and are converted into vector form. The obtained vectors are associated with the images of the positive data set. A fixed pool of very diverse images is used as a negative data set (element 21). With these positive and negative data sets, a classifier for the visual concept specified by the text query is learned using the process that has been described for figure 1. Disadvantages of this prior-art on-the-fly classification system have already briefly been discussed: (i) among the images retrieved from the image search engine are many incorrect representatives of the queried visual concept. This results in incorrect training data (the positive annotated data set) and significantly biases the result of the classification process; (ii) the relevant images retrieved from the image search engine, as well as the images in the fixed negative image data set, are too heterogeneous. This means that though the prior-art on-the-fly classification system of figure 2 is faster and much less laborious than the traditional classification system of figure 1, its classification is not as good as that of the traditional classification system, which uses a manually annotated positive data set.
Figure 3 is an attribute-based on-the-fly classification system according to an example embodiment. Attributes are high-level visual characteristics that the human brain uses when classifying images into categories. For example, a class of animals can also be represented by the attributes four-legged, furry, and striped. Attribute classifiers are built based on the same system setup as shown in figure 2, with positive images being representative of the attribute, and negative images that are not representative of the attribute. For example, to train an image classifier for the attribute "furry", positive images containing furry animals (e.g., cats, bears, ...) and negative images containing animals without fur (e.g., lizards, birds) are used. In the on-the-fly image classification system described here, a_k is used to denote the attribute classifier thus learned for attribute number k. A database of attribute classifiers is used. This attribute database can be built offline and/or built/extended on-the-fly. The classification system of figure 3 shows the following elements that are additional to the prior-art on-the-fly image classification system of figure 2: attribute engine 34, attribute database 35, positive dataset pruning function 36, and optionally a hybrid attribute/SVM ranking function 33. Elements in figure 3 that have been discussed previously in the context of figure 2 have the same numbers and are not discussed here further. As an alternative to SVM classification, other classification methods can be used, such as the K-nearest neighbor classifier. The latter does not produce a vector W but a partition of feature space, akin to Voronoi cells. In general, it can be said that the positive set, with feature vectors of images corresponding to the query visual concept, and the negative set, with feature vectors of images not corresponding to the query, are fed to a vector classifier. This defines a partition of the image feature space that consists of positive and negative regions of the feature space.
The function of elements 34, 35, 36 and 33 is as follows: (1) Attribute engine (element 34): The attribute engine maps a query visual concept (e.g., "sheep") into a list of related attributes (e.g. "sheep"-related attributes are {furry, hooves}). This function can be implemented according to several variant embodiments:
a. Using relational image data sets like ImageNet. Such relational image data sets are organized into a network of semantic/conceptual relationships, with a representative image set per concept. Given a query visual concept, neighboring concepts serve as attributes, and the corresponding representative images of the neighboring concepts are used to extend the attribute database block as described hereafter.
Annotating a dataset like ImageNet has been a crowd-sourced task that began in the 1980s.
b. The user provides query-related attributes as part of the query specification. This is done, for example, using a pre-defined syntax to enter the concept and related concept(s), e.g. <visual concept>: {attribute 1, attribute 2}, as in "sheep: {furry, hooves}", or different text boxes are used to enter the query, thereby clarifying which elements of the query correspond to the visual concept and which elements correspond to the query-related attributes. According to a variant embodiment, natural language processing techniques are used to extract visual concepts and related attributes from the user query text (e.g., nouns represent the visual concept, adjectives in the query represent the query attributes).
c. Using web mining. According to this variant embodiment, the text corresponding to the user query is fed to a textual search engine to retrieve attributes that are related to the query. Words that have a relatively high frequency of appearance in the retrieved documents are used as attributes; a minimal sketch of this frequency filter is given below.
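By way of illustration only, the following is a minimal Python sketch of the frequency filter of variant c, with the document retrieval step stubbed out; the stop-word list, the value of top_k and the name mine_attributes are assumptions made for the example.

    from collections import Counter

    STOPWORDS = {"the", "a", "an", "of", "and", "is", "are", "in", "to", "with"}

    def mine_attributes(documents, top_k=5):
        # count words over all retrieved documents, ignoring stop words
        counts = Counter(
            word
            for doc in documents
            for word in doc.lower().split()
            if word.isalpha() and word not in STOPWORDS
        )
        # keep the words with the highest relative frequency as attributes
        return [word for word, _ in counts.most_common(top_k)]

    docs = ["sheep are furry animals with hooves",
            "a furry sheep grazing",
            "the hooves of a sheep"]
    print(mine_attributes(docs))  # ['sheep', 'furry', 'hooves', 'animals', 'grazing']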
(2) Attribute Database (element 35): The list of attributes (e.g., {furry, hooves}) at the output of the attribute engine 34 is fed to this function. The output of this function is a set of corresponding attribute classifiers in matrix form, where e.g. a_furry and a_hooves are attribute classifiers for the visual concepts "furry" and "hooves": A = [a_furry, a_hooves]
The attribute database 35 contains a large storage of such attribute classifiers. Since attributes can be shared by a wide range of visual concepts, it is conceivable to build a useful, generic set of attributes. For example, these are color attributes ("red", "blue", "green"), shape attributes ("round", "straight", "curved"), and texture attributes ("rough", "smooth", "wavy"). These can also be application-specific attributes, such as "furry" and "hooves", for the case of animal databases.
A variant embodiment for creating the attribute database is to extend it while using the on-the-fly system: each attribute in the input list of attributes is fed independently to the on-the-fly system in Figure 3. The resulting attribute classifier is stored in the attribute database.
(3) Positive dataset pruner (element 36): The matrix A, which is the output of the attribute database 35, is used to prune (= remove) irrelevant images retrieved from the image search engine (18). "Irrelevant" means here not relevant to the query visual concept. Mathematical steps for realizing this function are for example:
a. All positive images retrieved from the image search engine, whose feature vectors form the columns of a matrix X+, are classified (their inner products are computed) using all attributes in matrix A, giving a matrix A^T X+ with one row per attribute and one column per image:
A^T X+ = [ a_furry^T x_+1 ... a_furry^T x_+N ; a_hooves^T x_+1 ... a_hooves^T x_+N ]
b. The matrix A^T X+ is normalized along rows, to make the different attribute scores comparable across images, by removing each row's mean and dividing by the row's standard deviation, thereby obtaining a normalized matrix A_norm.
c. The elements of each column of the normalized matrix A_norm are added to obtain a score s per image:
s = 1^T A_norm
The resulting attribute scores s are sorted in descending order and the last "n" images are discarded. The remaining images form the pruned positive data set 30. A minimal sketch of steps (a) to (c) is given below.
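The following is a minimal Python/NumPy sketch of steps (a) to (c), assuming, as in the notation above, that the attribute classifiers are the columns of A and the positive feature vectors are the columns of X+; the small constant added to the standard deviation is an assumption that guards against division by zero and is not part of the patent.

    import numpy as np

    def prune_positive_set(A, X_pos, n):
        # A: (d, K) attribute classifiers as columns (e.g. a_furry, a_hooves)
        # X_pos: (d, N) positive image feature vectors as columns
        scores = A.T @ X_pos                            # A^T X+, shape (K, N)
        mu = scores.mean(axis=1, keepdims=True)         # row-wise mean
        sd = scores.std(axis=1, keepdims=True) + 1e-12  # row-wise standard deviation
        A_norm = (scores - mu) / sd                     # normalized matrix A_norm
        s = A_norm.sum(axis=0)                          # s = 1^T A_norm, one score per image
        keep = np.argsort(-s)[: X_pos.shape[1] - n]     # discard the n lowest-scoring images
        return X_pos[:, keep]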
Then, the pruned positive data set 30 of image feature vectors is used as an input to the classification process of the un-annotated data set 14, as described for example for figure 2, using an SVM classifier 12 that also receives as input the image feature vectors of a set of negative images representing visual concepts that do not correspond to the query visual concept. The SVM classifier produces a vector W. This vector is used in an SVM ranker 33 to rank the images in the un-annotated data set 14 according to their relevance to the query visual concept; a sketch of this stage is given below.
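A possible sketch of this training and ranking stage is the following, using scikit-learn's LinearSVC as one concrete SVM implementation; this library choice is an assumption, the patent only requires an SVM classifier. Note that, following the scikit-learn convention, images are rows here rather than columns.

    import numpy as np
    from sklearn.svm import LinearSVC

    def train_and_rank(X_pos, X_neg, X_un):
        # X_pos: pruned positive set, X_neg: negative set, X_un: un-annotated set,
        # each of shape (num_images, d)
        X = np.vstack([X_pos, X_neg])
        y = np.concatenate([np.ones(len(X_pos)), np.zeros(len(X_neg))])
        svm = LinearSVC().fit(X, y)           # learns the classifier vector W
        scores = svm.decision_function(X_un)  # inner product with W (plus bias) per image
        return np.argsort(-scores)            # most relevant images first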
Figure 4 is a variant embodiment of the on-the-fly classification system of figure 3. The classification system described above comprises the previously discussed functions (1) to (3), and further comprises an optional additional function:
(4) Hybrid attribute/SVM ranking function (element 33): The matrix A, which is the output of the attribute database block, is additionally used to rank the un-annotated dataset X_T (element 14). A hybrid approach is thus used that mixes both the SVM classifier score and the attribute classifier score:
a. Similar to steps (a) to (c) of the above step (3) (positive data set pruning), an attribute score is obtained for the un-annotated data X_T: the elements in matrix X_T are classified using the attributes in matrix A, giving a matrix A^T X_T; this matrix is normalized along rows, giving a matrix T_norm; the elements of each column of the normalized matrix T_norm are added to obtain a score s_t per un-annotated image:
s_t = 1^T T_norm
b. Finally, the hybrid score used to carry out the hybrid ranking is given by:
score = alpha * norm(w^T X_T) + (1 - alpha) * norm(s_t)
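A minimal Python sketch of this hybrid ranking follows. Since the patent does not specify the norm(.) operation, min-max normalization to [0, 1] is assumed here so that the two scores are blended on a comparable scale.

    import numpy as np

    def _minmax(v):
        # assumed realization of norm(.): map scores to the [0, 1] range
        return (v - v.min()) / (v.max() - v.min() + 1e-12)

    def hybrid_scores(w, A, X_T, alpha=0.5):
        # w: (d,) SVM classifier vector; A: (d, K) attribute classifiers;
        # X_T: (d, N) un-annotated feature vectors as columns
        svm_scores = w @ X_T                   # w^T X_T, one SVM score per image
        T = A.T @ X_T                          # A^T X_T, shape (K, N)
        T_norm = (T - T.mean(axis=1, keepdims=True)) / \
                 (T.std(axis=1, keepdims=True) + 1e-12)
        s_t = T_norm.sum(axis=0)               # s_t = 1^T T_norm
        return alpha * _minmax(svm_scores) + (1 - alpha) * _minmax(s_t)

Images of the un-annotated set are then ranked by this hybrid score in descending order.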
Figure 5 is a flow chart of a particular embodiment of the method for classification of images. In a step 50, a query for a visual concept is received. In a step 51, a first image search for the query visual concept in a first set of images is executed, and image feature descriptors are extracted from images returned by the first image search, thereby forming a positive set of positive image feature descriptors of images corresponding to the query visual concept. In a step 52, a set of attribute visual concepts is determined that are related to the query visual concept. In a step 53, a second image search is executed in the first set of images for the attribute visual concepts, and image feature descriptors are extracted from images returned by the second image search, thereby forming an attribute set of attribute image feature descriptors of images corresponding to the attribute visual concepts. In a step 54, a first score is computed for each of the image feature descriptors in the positive set by calculating inner products between each of the positive image feature descriptors and the attribute image feature descriptors, and positive image feature descriptors having a first score that is under a determined threshold are removed from the positive set, thereby forming a pruned positive set. In a step 55, the pruned positive set and a negative set with feature vectors of images not corresponding to the query visual concept are fed to a vector classifier, thereby partitioning the image feature space in a positive feature space region and in a negative feature space region. In a step 56, images in an un-annotated set of images are classified by computing an inner product between vector W and each of the image feature descriptors of the un-annotated set, thereby obtaining a second score for each image in the un-annotated set, and classifying the images in the un-annotated set according to the second score.
Figure 6 is a device 600 implementing a particular embodiment of the described method. The device comprises: an interface (60) for receiving a query for a visual concept; a processing module (62) for executing a first image search for the query visual concept in a first set of images and extracting first image feature descriptors from images returned by the first image search that correspond to the queried visual concept, forming a positive set of positive image feature descriptors of images; a processing module (62) for determining a set of attribute visual concepts that are related to the query visual concept; a processing module (62) for executing a second image search in the first set of images for the set of attribute visual concepts and for extracting second image feature descriptors from images returned by the second image search, forming an attribute set of attribute image feature descriptors of images; a processing module (62) for computing a first score for each of the positive image feature descriptors in the positive set by calculating inner products between each of the positive image feature descriptors in the positive set and the attribute image feature descriptors in the attribute set, and for removing positive image feature descriptors from the positive set having a first score that is under a determined threshold, forming a pruned positive set; a processing module (62) for feeding the pruned positive set and a negative set with feature vectors of images not corresponding to the query visual concept to a vector classifier, thereby defining a partition of the image feature space in a positive region and in a negative region; a processing module (62) for classifying images in an un-annotated set of images by computing an inner product between the vector classifier and each of the image feature descriptors of the un-annotated set thereby obtaining a second score for each image in the un-annotated set, and classifying of the images in the un-annotated set according to the second score. The device further comprises a memory 61 for storage of variables, such as vectors and image sets. Processing module 62, interface 60 and memory 61 are interconnected through a data communication bus 63. The device 600 is connected to the outside through a network connection 64.
As will be appreciated by one skilled in the art, aspects of the present principles can be embodied as a system, method or computer readable medium. Accordingly, aspects of the present principles can take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code and so forth), or an embodiment combining hardware and software aspects that can all generally be referred to herein as a "circuit", "module" or "system". Furthermore, aspects of the present principles can take the form of a computer readable storage medium. Any combination of one or more computer readable storage medium(s) can be utilized.
Thus, for example, it will be appreciated by those skilled in the art that the block diagrams presented herein represent conceptual views of illustrative system components and/or circuitry embodying the principles of the present disclosure. Similarly, it will be appreciated that any flow charts, flow diagrams, state transition diagrams, pseudo code, and the like represent various processes which may be substantially represented in computer readable storage media and so executed by a computer or processor, whether or not such computer or processor is explicitly shown. A computer readable storage medium can take the form of a computer readable program product embodied in one or more computer readable medium(s) and having computer readable program code embodied thereon that is executable by a computer. A computer readable storage medium as used herein is considered a non-transitory storage medium given the inherent capability to store the information therein as well as the inherent capability to provide retrieval of the information therefrom. A computer readable storage medium can be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. It is to be appreciated that the following, while providing more specific examples of computer readable storage mediums to which the present principles can be applied, is merely an illustrative and not exhaustive listing as is readily appreciated by one of ordinary skill in the art: a portable computer diskette; a hard disk; a read-only memory (ROM); an erasable programmable read-only memory (EPROM or Flash memory); a portable compact disc read-only memory (CD-ROM); an optical storage device; a magnetic storage device; or any suitable combination of the foregoing.

Claims

1. A method of classification of images, the method being characterized in that it comprises:
receiving (50) a query for a visual concept;
executing (51) a first image search for said queried visual concept in a first set of images and extracting first image feature descriptors from images returned by said first image search that correspond to said queried visual concept, forming a positive set of positive image feature descriptors;
determining (52) a set of attribute visual concepts that are related to said queried visual concept;
executing (53) a second image search in said first set of images for said set of attribute visual concepts and extracting second image feature descriptors from images returned by said second image search, forming an attribute set of attribute image feature descriptors;
computing (54) a first score for each of the positive image feature descriptors in the positive set by calculating inner products between each of the positive image feature descriptors in the positive set and the attribute image feature descriptors in the attribute set, and removing from the positive set the positive image feature descriptors having a first score under a determined threshold, forming a pruned positive set;
feeding (55) said pruned positive set and a negative set of feature vectors of images not corresponding to the queried visual concept to a vector classifier, thereby defining a partition of the image feature space into a positive region and a negative region;
classifying (56) images in an un-annotated set of images by computing an inner product between the vector classifier and each of the image feature descriptors of the un-annotated set, thereby obtaining a second score for each image in the un-annotated set, and classifying the images in the un-annotated set according to said second score.
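Purely as an illustrative sketch of the feeding (55) and classifying (56) steps of claim 1, and not as a reference implementation, the training and scoring can be written as follows. The use of scikit-learn's LinearSVC as the vector classifier and the inclusion of a bias term are assumptions; the claim leaves the choice of vector classifier open.

    import numpy as np
    from sklearn.svm import LinearSVC

    def train_and_classify(pruned_positive, negative_set, unannotated):
        """Train a linear vector classifier on the pruned positive set and
        a negative set, then score the un-annotated descriptors by inner
        product with the classifier. All inputs are (n, d) arrays."""
        X = np.vstack([pruned_positive, negative_set])
        y = np.hstack([np.ones(len(pruned_positive)),
                       np.zeros(len(negative_set))])
        # The learned weight vector partitions the image feature space
        # into a positive region and a negative region.
        clf = LinearSVC().fit(X, y)
        w, b = clf.coef_.ravel(), clf.intercept_[0]
        # Second score: inner product between the classifier and each
        # un-annotated image feature descriptor.
        second_scores = unannotated @ w + b
        # Classify the un-annotated images according to the second score.
        return second_scores, second_scores >= 0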
2. The method according to claim 1, wherein the method further comprises a step of computing a third score for each of the images in the un-annotated set by calculating inner products between image feature descriptors of each of the images in the un-annotated set and the image feature descriptors in the attribute set, and combining the third score with the second score obtained in the classifying step to obtain a fourth score for each of the images in the un-annotated set, and classifying the images in the un-annotated set according to said fourth score.
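A minimal sketch of claim 2, under the assumption that the third score is the mean inner product with the attribute set and that the combination is a weighted sum; neither the aggregation nor the combination rule is prescribed by the claim.

    import numpy as np

    def fourth_scores(unannotated, attribute_set, second_scores, alpha=0.5):
        """Combine the classifier-based second score with an
        attribute-similarity third score into a fourth score."""
        # Third score: mean inner product between each un-annotated
        # descriptor and the attribute descriptors (aggregation assumed).
        third = (unannotated @ attribute_set.T).mean(axis=1)
        # Fourth score: weighted combination, with the weight alpha assumed.
        return alpha * second_scores + (1.0 - alpha) * third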
3. The method according to claim 1, wherein said determining of said set of attribute visual concepts that are related to said queried visual concept comprises a step of retrieving a list of attributes that are related to the queried visual concept from a relational image set.
4. The method according to claim 1, wherein said determining of said set of attribute visual concepts that are related to said queried visual concept comprises a step of retrieving a list of attributes that are related to the queried visual concept from information comprised in the query.
5. The method according to claim 1, wherein said determining of said set of attribute visual concepts that are related to said queried visual concept comprises a step of retrieving a list of attributes that are related to the queried visual concept through language processing of the query.
6. The method according to claim 1, wherein said determining of said set of attribute visual concepts that are related to said queried visual concept comprises a step of retrieving a list of attributes that are related to the queried visual concept through a word search in textual data, whereby words that have a high frequency of occurrence relative to other words are retained for inclusion in said list of attributes.
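A minimal sketch of the word-frequency retrieval of claim 6. The tokenization, the stop-word list and the top_k cut-off are assumptions; the claim only requires that words with a high frequency of occurrence relative to other words be retained.

    from collections import Counter

    def attribute_list(textual_data, query, top_k=10):
        """Return the words occurring most frequently, relative to the
        other words, in text retrieved for the queried visual concept."""
        # A small stop-word list, assumed for illustration only.
        stop_words = {"the", "a", "an", "of", "and", "in", "is", "to"}
        words = [w for w in textual_data.lower().split()
                 if w.isalpha() and w not in stop_words
                 and w != query.lower()]
        # Retain the top_k most frequent words as attribute concepts.
        return [w for w, _ in Counter(words).most_common(top_k)]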
7. A device (600) for classification of images, the device being characterized in that it comprises:
an interface (60) for receiving a query for a visual concept;
a processing module (62) for executing a first image search for said queried visual concept in a first set of images and extracting first image feature descriptors from images returned by said first image search that correspond to said queried visual concept, forming a positive set of positive image feature descriptors;
a processing module for determining a set of attribute visual concepts that are related to said queried visual concept;
a processing module for executing a second image search in said first set of images for said set of attribute visual concepts and for extracting second image feature descriptors from images returned by said second image search, forming an attribute set of attribute image feature descriptors;
a processing module for computing a first score for each of the positive image feature descriptors in the positive set by calculating inner products between each of the positive image feature descriptors in the positive set and the attribute image feature descriptors in the attribute set, and for removing from the positive set the positive image feature descriptors having a first score under a determined threshold, forming a pruned positive set;
a processing module for feeding said pruned positive set and a negative set of feature vectors of images not corresponding to the queried visual concept to a vector classifier, thereby defining a partition of the image feature space into a positive region and a negative region;
a processing module for classifying images in an un-annotated set of images by computing an inner product between the vector classifier and each of the image feature descriptors of the un-annotated set, thereby obtaining a second score for each image in the un-annotated set, and classifying the images in the un-annotated set according to said second score.
PCT/EP2014/068166 2013-09-06 2014-08-27 Method of classification of images and corresponding device WO2015032670A1 (en)

Applications Claiming Priority (4)

Application Number   Priority Date
EP13306226.5         2013-09-06
EP13306226           2013-09-06
EP14305745           2014-05-20
EP14305745.3         2014-05-20

Publications (1)

Publication Number Publication Date
WO2015032670A1 (en)  2015-03-12

Family

ID=51399668

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2014/068166 WO2015032670A1 (en) 2013-09-06 2014-08-27 Method of classification of images and corresponding device

Country Status (1)

Country Link
WO (1) WO2015032670A1 (en)

Non-Patent Citations (7)

* Cited by examiner, † Cited by third party
Title
ANONYMOUS: "Cosine similarity - Wikipedia, the free encyclopedia", 10 August 2013 (2013-08-10), XP055159814, Retrieved from the Internet <URL:http://en.wikipedia.org/w/index.php?title=Cosine_similarity&oldid=567987930> [retrieved on 20141222] *
BISWAS ARIJIT ET AL: "Simultaneous Active Learning of Classifiers & Attributes via Relative Feedback", IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION. PROCEEDINGS, IEEE COMPUTER SOCIETY, US, 23 June 2013 (2013-06-23), pages 644 - 651, XP032492884, ISSN: 1063-6919, [retrieved on 20131002], DOI: 10.1109/CVPR.2013.89 *
CLAUDE SAMMUT; GEOFFREY I. WEBB: "Encyclopedia of Machine Learning", 1 January 2011, SPRINGER SCIENCE AND BUSINESS MEDIA, New York, US, ISBN: 978-0-387-30768-8, article CHRIS DRUMMOND; JANEZ BRANK; DUNJA MLADENIC; MARCO GROBELNIK; HUAN LIU, pages: 52-54, 397-410, XP002734003 *
DHRUV MAHAJAN ET AL: "A joint learning framework for attribute models and object descriptions", COMPUTER VISION (ICCV), 2011 IEEE INTERNATIONAL CONFERENCE ON, IEEE, 6 November 2011 (2011-11-06), pages 1227 - 1234, XP032093762, ISBN: 978-1-4577-1101-5, DOI: 10.1109/ICCV.2011.6126373 *
KEN CHATFIELD ET AL: "VISOR: Towards On-the-Fly Large-Scale Object Category Retrieval", 5 November 2012, COMPUTER VISION ACCV 2012, SPRINGER BERLIN HEIDELBERG, BERLIN, HEIDELBERG, PAGE(S) 432 - 446, ISBN: 978-3-642-37443-2, XP047027140 *
KOVASHKA A ET AL: "WhittleSearch: Image search with relative attribute feedback", COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2012 IEEE CONFERENCE ON, IEEE, 16 June 2012 (2012-06-16), pages 2973 - 2980, XP032232425, ISBN: 978-1-4673-1226-4, DOI: 10.1109/CVPR.2012.6248026 *
YU SU ET AL: "Improving Image Classification Using Semantic Attributes", INTERNATIONAL JOURNAL OF COMPUTER VISION, KLUWER ACADEMIC PUBLISHERS, BO, vol. 100, no. 1, 8 May 2012 (2012-05-08), pages 59 - 77, XP035075183, ISSN: 1573-1405, DOI: 10.1007/S11263-012-0529-4 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10922327B2 (en) 2013-09-20 2021-02-16 Ebay Inc. Search guidance
US11640408B2 (en) 2013-09-20 2023-05-02 Ebay Inc. Search guidance
US11222064B2 (en) * 2015-12-31 2022-01-11 Ebay Inc. Generating structured queries from images
CN110489594A (en) * 2018-05-14 2019-11-22 北京松果电子有限公司 Image vision mask method, device, storage medium and equipment
CN113222018A (en) * 2021-05-13 2021-08-06 郑州大学 Image classification method
CN113222018B (en) * 2021-05-13 2022-06-28 郑州大学 Image classification method

Similar Documents

Publication Publication Date Title
Guillaumin et al. Multimodal semi-supervised learning for image classification
EP3248143B1 (en) Reducing computational resources utilized for training an image-based classifier
US11042586B2 (en) Clustering search results based on image composition
Liu et al. Label to region by bi-layer sparsity priors
US10482146B2 (en) Systems and methods for automatic customization of content filtering
WO2017097231A1 (en) Topic processing method and device
US11803971B2 (en) Generating improved panoptic segmented digital images based on panoptic segmentation neural networks that utilize exemplar unknown object classes
Mei et al. Coherent image annotation by learning semantic distance
US10007864B1 (en) Image processing system and method
Lee et al. MAP-based image tag recommendation using a visual folksonomy
WO2015032670A1 (en) Method of classification of images and corresponding device
KR101472451B1 (en) System and Method for Managing Digital Contents
Li et al. Technique of image retrieval based on multi-label image annotation
Abdel-Nabi et al. Content based image retrieval approach using deep learning
US20150254280A1 (en) Hybrid Indexing with Grouplets
Chen et al. An annotation rule extraction algorithm for image retrieval
Kumar et al. Fusion of CNN-QCSO for Content Based Image Retrieval
CN112784893B (en) Image data clustering method and device, electronic equipment and storage medium
Dorado et al. Semantic labeling of images combining color, texture and keywords
Manjula et al. Visual and tag-based social image search based on hypergraph ranking method
Haldurai et al. Parallel indexing on color and texture feature extraction using R-tree for content based image retrieval
Zhong et al. Region level annotation by fuzzy based contextual cueing label propagation
Sabitha et al. Hybrid approach for image search reranking
CN105701150A (en) Intuitionistic fuzzy similarity degree based image retrieving method and system
Yanai et al. Region-based automatic web image selection

Legal Events

Code   Title   Description
121    Ep: the epo has been informed by wipo that ep was designated in this application   Ref document number: 14755693; Country of ref document: EP; Kind code of ref document: A1
NENP   Non-entry into the national phase   Ref country code: DE
122    Ep: pct application non-entry in european phase   Ref document number: 14755693; Country of ref document: EP; Kind code of ref document: A1