AU2011254041A1 - Compression of sift vectors for image matching - Google Patents


Info

Publication number
AU2011254041A1
Authority
AU
Australia
Prior art keywords
feature vector
determining
quantised
numerical identifier
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
AU2011254041A
Inventor
Barry James Drake
Alan Valev Tonisson
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Canon Inc
Original Assignee
Canon Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Canon Inc filed Critical Canon Inc
Priority to AU2011254041A priority Critical patent/AU2011254041A1/en
Publication of AU2011254041A1 publication Critical patent/AU2011254041A1/en
Abandoned legal-status Critical Current

Landscapes

  • Image Analysis (AREA)

Abstract

COMPRESSION OF SIFT VECTORS FOR IMAGE MATCHING

Disclosed is a method (1500) of determining a coordinate of a quantised feature vector (439) representing a portion (105) of an image, where the quantised feature vector (439) is represented by a numerical identifier (1410), the method comprising the steps of:
(a) determining (1310) the height of a simplex having a volume equal to the numerical identifier (1410);
(b) determining (1320) a bound for a parameter defining a corresponding binomial coefficient, said bound being dependent upon the height of said simplex;
(c) determining (1330), constrained by the determined bound, the largest value of the parameter such that the value of the corresponding binomial coefficient is the largest value less than or equal to the numerical identifier; and
(d) determining the coordinate of the quantised feature vector based on the inverse of said largest binomial coefficient, said quantised feature vector representing the portion of the image.

P0051 10 specijlodged / 5830629_1 141010

[Fig. 1: flow chart. Start; determine feature vector (110), giving a feature vector (see 400 in Fig. 4); map into a subspace using a radial projection (120, see Fig. 5), giving a mapped vector (see 437 in Fig. 4); quantize point in the subspace (130, see Figs. 6, 7 and 8), giving a quantised feature vector (see 439 in Fig. 4); calculate label (see Figs. 9, 10 and 11), giving a compressed feature vector in the form of a label (see 441 in Fig. 4); End.]

Description

S&F Ref: P005110
AUSTRALIA
PATENTS ACT 1990
COMPLETE SPECIFICATION FOR A STANDARD PATENT
Name and Address of Applicant: Canon Kabushiki Kaisha, of 30-2, Shimomaruko 3-chome, Ohta-ku, Tokyo, 146, Japan
Actual Inventor(s): Barry James Drake, Alan Valev Tonisson
Address for Service: Spruson & Ferguson, St Martins Tower, Level 35, 31 Market Street, Sydney NSW 2000 (CCN 3710000177)
Invention Title: Compression of sift vectors for image matching
The following statement is a full description of this invention, including the best method of performing it known to me/us:
COMPRESSION OF SIFT VECTORS FOR IMAGE MATCHING

TECHNICAL FIELD
The current invention relates to methods of reducing storage space for image feature vectors while maintaining their ability to accurately match images, and in particular to methods of reducing storage space for Scale Invariant Feature Transform (SIFT) feature vectors.

BACKGROUND
Contemporary information systems can produce extremely large volumes of digital content. Managing such digital collections requires large physical storage space as well as fast and effective retrieval methods. Text-based retrieval systems typically create huge indexes using every word in every document in the database. A text query can then be compared to the index in order to identify the most relevant documents. Parallel processing has made such text retrieval systems remarkably fast. However, text retrieval has its limits, as digital content is often composed of a mixture of images and text.

Retrieving images from large databases is a notoriously difficult problem. Image retrieval has been approached from the perspective of searching and matching manually or semi-automatically entered keywords, with limited success. Systems able to filter images based on their content are expected to provide more accurate results. The most common content-based search engines typically combine various primitive features such as colour, texture and structure to describe an image. These features are used to define similarity measures to retrieve images that are alike. There are many applications for image retrieval and therefore many possible meanings for this notion of similarity. For instance, when trying to find duplicate photographs in a photo library, two similar images may differ by a small rotation, a difference in perspective or in scene composition, a cropping operation, rescaling, small edits, etc.
Similarly, when applied to images of a person's face, image retrieval requires robustness to account for differences in lighting conditions.

In recent years, SIFT (Scale-Invariant Feature Transform) features have been applied to content-based image retrieval in order to match pairs of images where the ability to match needs to be robust to image noise, rescaling, rotation and cropping. SIFT is an image processing method that identifies key points in an image and computes one or more feature vectors for each key point. To determine a match score, the SIFT feature vectors of a query image can be compared to those in a database, for instance using a Euclidean (or other) distance. Match scores may be compared to find images that best match a given query image.

An image can have hundreds or thousands of key points. Each key point is associated with a feature vector of 128 numbers (i.e. "coordinates"). If double-precision floating-point numbers are used (8 bytes each), then a SIFT vector requires 128 x 8 = 1024 bytes, i.e. 1 Kbyte, of storage. In large databases, the large amount of data per image leads to large storage requirements and also to slow matching of vectors.

Various techniques have been developed to reduce the storage requirements of SIFT vectors while maintaining their ability to accurately match images. One possible approach is to quantize each coordinate of a SIFT vector to a discrete set of values. This style of quantization is known as scalar quantization. The disadvantage of scalar quantization methods is that they lead to high quantization errors for a given bit budget.

SUMMARY
It is an object of the present invention to substantially overcome, or at least ameliorate, one or more disadvantages of existing arrangements.
Disclosed are arrangements, referred to as Delabelling by Bounded Binomial Inversion (DBBI) arrangements, which seek to address the above problems by representing SIFT vectors using numerical identifiers, or labels, and recovering approximations of the SIFT vectors from the labels as described below.

According to a first aspect of the present invention, there is provided a method of determining a coordinate of a quantised feature vector representing a portion of an image, where the quantised feature vector is represented by a numerical identifier, the method comprising the steps of:
(a) determining the height of a simplex having a volume equal to the numerical identifier;
(b) determining, dependent upon the height of said simplex, a bound for a parameter defining a corresponding binomial coefficient;
(c) determining, constrained by the determined bound, the largest value of the parameter such that the value of the corresponding binomial coefficient is the largest value less than or equal to the numerical identifier; and
(d) determining the coordinate of the quantised feature vector based on the inverse of said largest binomial coefficient, said quantised feature vector representing the portion of the image.
According to another aspect of the present invention, there is provided an apparatus for determining a coordinate of a quantised feature vector representing a portion of an image, where the quantised feature vector is represented by a numerical identifier, the apparatus comprising:
(a) means for determining the height of a simplex having a volume equal to the numerical identifier;
(b) means for determining, dependent upon the height of said simplex, a bound for a parameter defining a corresponding binomial coefficient;
(c) means for determining, constrained by the determined bound, the largest value of the parameter such that the value of the corresponding binomial coefficient is the largest value less than or equal to the numerical identifier; and
(d) means for determining the coordinate of the quantised feature vector based on the inverse of said largest binomial coefficient, said quantised feature vector representing the portion of the image.
According to another aspect of the present invention, there is provided a computer readable storage medium having a computer program recorded therein, the program being executable by a computer apparatus to make the computer perform a method for determining a coordinate of a quantised feature vector representing a portion of an image, where the quantised feature vector is represented by a numerical identifier, the program comprising:
(a) computer software code for determining the height of a simplex having a volume equal to the numerical identifier;
(b) computer software code for determining, dependent upon the height of said simplex, a bound for a parameter defining a corresponding binomial coefficient;
(c) computer software code for determining, constrained by the determined bound, the largest value of the parameter such that the value of the corresponding binomial coefficient is the largest value less than or equal to the numerical identifier; and
(d) computer software code for determining the coordinate of the quantised feature vector based on the inverse of said largest binomial coefficient, said quantised feature vector representing the portion of the image.
According to another aspect of the present invention, there is provided an apparatus for determining a coordinate of a quantised feature vector representing a portion of an image, where the quantised feature vector is represented by a numerical identifier, the apparatus comprising:
a processor; and
a memory in which is stored a program configured to direct the processor to perform a method comprising the steps of:
(a) determining the height of a simplex having a volume equal to the numerical identifier;
(b) determining, dependent upon the height of said simplex, a bound for a parameter defining a corresponding binomial coefficient;
(c) determining, constrained by the determined bound, the largest value of the parameter such that the value of the corresponding binomial coefficient is the largest value less than or equal to the numerical identifier; and
(d) determining the coordinate of the quantised feature vector based on the inverse of said largest binomial coefficient, said quantised feature vector representing the portion of the image.

Other aspects of the invention are also disclosed.

BRIEF DESCRIPTION OF THE DRAWINGS
One or more embodiments of the invention will now be described with reference to the following drawings and Appendices, in which:
Fig. 1 is a flow chart illustrating a method of compressing feature vectors according to one DBBI arrangement;
Fig. 2 is a functional block diagram illustrating a database creation system according to a DBBI arrangement;
Fig. 3 is a functional block diagram illustrating an image query system according to a DBBI arrangement;
Fig. 4 is a pictorial representation of a method of compressing feature vectors according to a DBBI arrangement;
Fig. 5 depicts a radial projection according to a DBBI arrangement;
Fig. 6 depicts feature vector quantization according to a DBBI arrangement;
Fig. 7 depicts truncated Voronoi cells according to a DBBI arrangement;
Fig. 8 depicts an improved mapping function according to a DBBI arrangement;
Fig. 9 depicts a method of labelling feature vectors according to a DBBI arrangement;
Fig. 10 illustrates how quantized feature vectors are converted to unique labels according to a DBBI arrangement;
Fig. 11 shows an example of an alternative method for uniquely labelling quantized feature vectors according to a DBBI arrangement;
Fig. 12A shows how to match a query image against images stored in an image database according to a DBBI arrangement;
Fig. 12B shows how to match a query image against images stored in an image database according to another DBBI arrangement;
Fig. 12C shows how to match a query image against images stored in an image database according to another DBBI arrangement;
Fig. 12D shows how to match a query image against images stored in an image database according to another DBBI arrangement;
Fig. 13 is a flowchart depicting how to find the largest binomial coefficient smaller than a given value according to a DBBI arrangement;
Fig. 14 shows how to decompress compressed feature vectors according to a DBBI arrangement;
Fig. 15 is a flowchart depicting how to determine the coordinates of a quantized vector from its label according to a DBBI arrangement; and
Figs. 16A and 16B form a schematic block diagram of a general purpose computer system upon which the DBBI arrangements described can be practiced.

DETAILED DESCRIPTION INCLUDING BEST MODE
Where reference is made in any one or more of the accompanying drawings to steps and/or features that have the same reference numerals, those steps and/or features have for the purposes of this description the same function(s) or operation(s), unless the contrary intention appears.
It is to be noted that the discussions contained in the "Background" section and the section above relating to prior art arrangements relate to discussions of documents or devices which may form public knowledge through their respective publication and/or use. Such discussions should not be interpreted as a representation by the present inventor(s) or the patent applicant that such documents or devices in any way form part of the common general knowledge in the art.

Image
The word 'image' is used to mean a visible impression as obtained by a camera, telescope, microscope, scanner or other imaging device, and may result from photographic capture of a physical scene, document, or artwork. Images may also be artificially generated, examples being an image of a rendered document, a 3D model or an artistically manipulated image.

Image querying use cases
The DBBI arrangements described herein relate to image-based recognition systems. In such systems, a digital image is used to query a database for similar images. Similar images may have visually perceptible similarities in their content without necessarily being pixel-wise identical. For instance, two images may be considered similar even if they differ by some geometric transformation such as rotation, scaling, perspective transform, shear, symmetry or cropping. Other variations that may arise between two similar images include those arising from variations in Point Spread Function, dynamic range and capture noise, and from capture of a partially overlapping scene.

Fast and reliable image query methods are required for various systems such as surveillance systems, medical imaging, office document management and security or digital photography. Image query systems for office use may require quick retrieval of the electronic version of a printed document using a captured image of the printed document obtained from a scanner or a camera.
Example image querying system
Consider an image querying system where a user needs to obtain the location of an electronic document on a server using a physical or electronic reproduction of the document. The document type may be a report, a chart, a spreadsheet, a brochure or a photograph that has been printed or faxed, or may be derived by capturing an image displayed on a monitor. The document may contain a mixture of content such as images, text, charts, graphics or photos, and may comprise one or more pages.

Fig. 2 depicts a data-processing architecture 200 according to one DBBI arrangement where a database of feature vectors 299 is created from every document 218 that is sent to a printer 212. When the document 218 is to be printed, a PDL file 210 is sent to a print driver 211. The print driver analyses the PDL file 210 and produces a rasterised image 213 in addition to sending an instruction 219 to the printer 212 to print the document 218. A job information file 214, which is also produced by the print driver, records job metadata associated with the document 218 such as the location of the electronic document 218 on a server (not shown), the date and time the job is created, and the login of the user printing the document 218. The rasterised image 213 is analysed and a number of feature vectors 220 are produced by a feature vector extractor 215. The feature vectors 220 are then compressed in a feature vector reducer 216 before the compressed feature vectors 221 and the job information file 214 are associated and registered into the database 299 by a database loader 217. The compression of feature vectors to reduce storage space, as performed by the feature vector reducer 216, is described hereinafter in more detail in regard to Fig. 4.

Fig. 3 illustrates a data-processing architecture according to a DBBI arrangement 300 where a query based on a reproduced document is processed. In Fig. 3, the system 300 generates a captured query image 310, which is typically a digital image obtained from a physical document 305 that the user wishes to retrieve. The query image 310 may be created in various ways such as scanning, faxing or using a camera to take a photo of the printed document 305.

The query image 310 is processed by a feature vector extraction stage 320 to extract a first set of feature vectors 321. A second set of feature vectors 322 is obtained from the database 299 (see Fig. 2) by a feature vector selection stage 325. In a feature vector comparison step 340, the set of extracted feature vectors 321 determined by the feature vector extractor 320 is compared to the selected feature vectors 322 obtained from the feature vector selector 325. The comparison may be based on Euclidean distances (or other distances) between the feature vectors 321 of the query image 310 and those of the images in the database 299 selected by the selector 325. The comparison returns a match score 350 for each image in the database 299. The system 300 selects images from the database 299 that produce the highest match scores and presents the images to the user.

Examples of feature vectors
SIFT is an image processing method that identifies key points in an image and computes, as depicted in a step 110 in Fig. 1, a feature vector 111 for each identified key point. A SIFT vector is the normalized histogram of gradients in a patch, such as 105 in Fig. 1, around a key point in an image. SIFT vectors have the following properties:
1. Their coordinates are all non-negative; and
2. The Euclidean length of a SIFT vector is one.
Therefore, the SIFT vectors extracted from an image can be represented on the "orthant" of the surface of a hyper-sphere that does not have negative coordinate values. The term "orthant" refers to a higher-dimensional generalization of (a) a quadrant in two dimensions, and (b) an octant in three dimensions.
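The two properties listed above, and a Euclidean comparison of the kind used in the comparison step 340, can be illustrated with a short sketch. The 4-bin histograms are made-up stand-ins for real 128-bin SIFT histograms, and the helper names `normalise` and `euclidean_distance` are illustrative only; they do not come from the specification.

```python
import math


def normalise(histogram):
    """Scale a non-negative histogram to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in histogram))
    if norm == 0:
        raise ValueError("histogram must contain a non-zero entry")
    return [x / norm for x in histogram]


def euclidean_distance(u, v):
    """Euclidean distance between two vectors of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Hypothetical gradient histograms (real SIFT vectors have 128 bins).
query = normalise([3.0, 4.0, 0.0, 0.0])
stored = normalise([3.0, 3.0, 1.0, 0.0])

# Property 1: every coordinate is non-negative.
print(all(x >= 0 for x in query))
# Property 2: the Euclidean length is one (up to rounding).
print(math.sqrt(sum(x * x for x in query)))
# A small distance indicates similar features.
print(euclidean_distance(query, stored))
```

Because both vectors lie on the non-negative orthant of the unit hyper-sphere, the distance between any two such vectors cannot exceed the square root of two.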
In practice, a SIFT vector can be represented by 128 double-precision floating-point numbers. In an alternative DBBI arrangement, a SIFT vector can be represented as an array of 128 8-bit fixed point numbers.

The DBBI arrangements herein described apply in particular to compression of SIFT feature vectors. However, it should be understood that the DBBI arrangements more generally apply to other types of feature vectors, such as vectors whose coordinates represent frequencies in a histogram, any type of feature vector with properties (1) and (2) above, and feature vectors that may be easily converted to a form satisfying (1) and (2). For example, vectors that lie on a specific orthant of a hyper-sphere have coordinates with predetermined signs, and could thus be converted to coordinates that are all non-negative.

Noting that condition (2) implies that one of the coordinates is redundant, a SIFT vector may alternatively be represented by a vector of 127 non-negative coordinates whose Euclidean length is not greater than one. "Euclidean length" is the same as the Euclidean norm, which is the square root of the sum of the squares of the coordinates.

Challenges of feature storage
Storage of a large number of feature vectors is more efficient if the feature vectors are compressed. However, feature vectors must be compressed in such a way that it is still possible, using only the compressed vectors, to determine if any two uncompressed feature vectors are similar. This means that either the compressed vectors must be comparable, or the compressed vectors must be such that they can be made comparable after partial or complete decompression. In this context "comparable" means that we can determine how different or similar the features represented by two vectors are.
Euclidean distance is one example of a measure of similarity between two vectors, where a small distance indicates that the vectors represent similar features and a large distance indicates that the vectors represent dissimilar features. Additionally, compressed feature vectors need to be encoded using as few bits as possible to reduce storage overhead.

Feature vector compression method
Fig. 1 depicts a method of compressing a feature vector. In the step 110, the feature vector 111 of dimension N > 1 is obtained for the portion 105 of an image being considered. In a following step 120, a radial projection is applied to the feature vector 111. This projection effectively maps the feature vector 111 into a Euclidean sub-space of dimension n < N to thereby form a mapped vector 112. The radial projection step 120 will be described hereinafter in more detail with reference to Fig. 5. Next, a quantization step 130 quantizes the mapped vector 112 to one of a finite number of predetermined points within the subspace to thereby form a quantized feature vector 113. Thereafter a label calculation step 140 constructs a compressed feature vector 114 in the form of a label (also referred to as a numerical identifier) from the quantized feature vector 113. The label 114 is typically an index of the predetermined point to which the feature vector has been quantized, the index being determined according to a pre-determined enumeration, i.e. the predetermined points are assigned integer index values starting from 0 up to one less than the total number of predetermined points. The label 114 can be a bit-string that encodes the value of the index. Using a limited range of sequence numbers limits the number of bits required to encode any index value, therefore limiting the number of bits required to encode a quantised vector.
As long as each sequence number in the limited range has an equal possibility of being used (to encode a SIFT vector), no storage is wasted by providing bits to describe sequence numbers that can never be used. More details on the steps of mapping 120, quantization 130 and labelling 140 are provided hereinafter with reference to Figs. 5, 6-8, and 9-11 respectively. In general, the label 114 can take the form of a numerical string, serving as an identifier.

Fig. 4 is a pictorial representation of the method of Fig. 1 for compressing, as depicted by a dashed arrow 442, feature vectors of dimension 3. Consider a normalized feature vector 400 with positive coordinates. This vector lies on the positive octant of the surface of a sphere. A gnomonic projection 420 transforms the feature vector 400 to form a mapped feature vector 437 that lies in a subspace 425. In this example, the subspace 425 is a plane and the mapped feature vector occupies a triangular region of the subspace. This projection corresponds to the mapping step 120 that is described in relation to Fig. 1. The mapped vector 437 is then quantized 430 to one 439 of a predetermined finite set 435 of quantized points on the subspace 425. This quantization corresponds to the quantization step 130 in Fig. 1. Each of the quantized points of the set 435 is assigned 440 a corresponding unique label 441. The label 441 of the quantized point 439 that results from the step 430 represents, in compressed form, the feature vector 400.

The aforementioned three-dimensional example can be generalized to a higher dimensional space, in which case the feature vector 400 will lie on a hyper-sphere and will be mapped onto a subset of a hyper-plane where the subset is a simplex.

Mapping
The step 120 in Fig. 1 of mapping a feature vector is now described in more detail in relation to Fig. 5. In one DBBI arrangement a radial projection is used to perform the mapping 120.
The radial projection projects a feature vector of dimension N into a subspace of dimension N-1. Fig. 5 illustrates such a radial projection, where a feature vector 510 of dimension 2 lies on the positive quadrant of a circle. A radial projection projects the vector 510 onto a vector 520 that lies on a line segment 530. The line segment 530 is a one-dimensional simplex contained in a one-dimensional subspace. The two-dimensional example of Fig. 5 can be generalized to any higher dimension. In three dimensions, feature vectors occupy an octant of a unit sphere and the line segment is replaced with a triangle. The radial projection in three dimensions is an example of a gnomonic projection known to cartographers.

More generally, the radial projection $\Phi(v)$ of a feature vector $v$ of N dimensions can be described by the following equation:

$\Phi(v) = \dfrac{h\,v}{\|v\|_1}$   (1)

where $h$ is an integer representing the relative scale of the quantizing lattice, and $\|v\|_1$ is the $\ell_1$ norm of $v$, which, in this instance, is equal to the sum of the coordinates of $v$ because the coordinates of the feature vector are constrained to be non-negative. The sum of the coordinates of each mapped vector $\Phi(v)$ will be $h$, and the mapped points will lie inside an n-dimensional simplex, where n is one less than N, and N is the number of coordinates of the feature vector $v$. Note that since the sum of the coordinates of each mapped vector is a predetermined value, it is possible to determine one coordinate from the sum of the remaining values. As a result, one coordinate of the vector $\Phi(v)$ may be considered to be redundant.

In alternative DBBI arrangements, different types of functions may be used in the step 120 to map the feature vectors, including linear projections, perspective transforms and radial projection with a centre that is not at the origin.
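As a minimal sketch of the radial projection of Equation (1), taking that equation to be the scaling of v by h divided by its l1 norm, the mapping can be written as below. The function name `radial_projection` and the example values are illustrative and do not come from the specification.

```python
def radial_projection(v, h):
    """Scale a non-negative vector so that its coordinates sum to h.

    For non-negative coordinates the l1 norm is simply the coordinate sum.
    """
    if any(x < 0 for x in v):
        raise ValueError("coordinates must be non-negative")
    l1_norm = sum(v)
    if l1_norm == 0:
        raise ValueError("vector must be non-zero")
    return [h * x / l1_norm for x in v]


# A unit-length, non-negative vector of dimension N = 3 (illustrative values).
v = [0.6, 0.8, 0.0]
mapped = radial_projection(v, h=10)

print(mapped)       # lies on the plane where the coordinates sum to h
print(sum(mapped))  # equals h up to floating-point rounding
```

Since the coordinate sum is fixed at h, any one coordinate of the result can be recovered from the others, which is the redundancy noted in the text.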
In another DBBI arrangement, the mapping operation 120 may be combined with a dimensionality reduction operator, such as an operator based on Principal Component Analysis (PCA). In this case, the dimension of the resulting subspace can be less than N-1.

In yet another DBBI arrangement, the mapping function 120 can be modified in order to improve the efficiency of the quantization step 130. An example of such modification will be described in more detail hereinafter with respect to Fig. 8.

Quantization
In one DBBI arrangement, the predetermined points in the set 435 are regularly spaced and correspond to points of a lattice. Examples of such lattices include $A_n$, $A_n^*$, $D_n$ and $D_n^*$, where the subscript n refers to the number of dimensions of the subspace spanned by the lattice points, which is equal to the number of dimensions of the subspace 425. However, the present DBBI arrangement compresses SIFT vectors using a set of predetermined points defined by the $A_{127}$ lattice, forming a pyramid with points represented using 128 coordinates.

An advantage of using a lattice to define the set 435 of predetermined points is that storage does not have to be allocated to represent unused points, so the number of predetermined points may be much larger than the number of points used. Because the coordinates of the predetermined point closest to a given mapped vector may be calculated, the coordinates of unused points do not have to be stored in the computer's memory or loaded from disk. Efficient methods for calculating the nearest lattice point to a given vector are known for the $A_n$ family as well as other lattices.

Fig. 6 depicts the step 130 of quantizing feature vectors, and shows a predetermined, finite set 611 of points, such as a point 610, chosen in the subspace described above in the step 120. Each of these points 610 is at the centre of, and defines, a corresponding Voronoi cell 620.
The Voronoi cell 620 has the property that any point 612 located inside the Voronoi cell 620 is closer to the point 610 than to any other of the predetermined points in the set 611. An example feature vector 630 can thus be quantized (i.e. mapped) to a point 640 that represents a Voronoi cell 650.

One possible choice for selecting the predetermined points in the set 611 is to use the set of points with non-negative integer coordinates that add up to a predetermined value h. This set is defined by the following equation:

$P_{n,h} = \left\{ p \in \mathbb{Z}^{n+1} \;:\; \sum_{i=1}^{n+1} p_i = h,\ \forall i\ p_i \ge 0 \right\}$   (2)

where n is the dimensionality of the subspace containing the points, h is the aforementioned sum, $\mathbb{Z}^{n+1}$ denotes the set of points with n+1 integer coordinates, and $p_i$ denotes the i-th coordinate of a point p.

These points form a pyramid consisting of a bounded subset of a scaled and translated copy of an $A_n$ lattice, where the subscript n refers to the dimension of the subspace spanned by the lattice points, which is one less than the number of coordinates of each point. E.g. $P_{2,2}$ = { (0,0,2), (0,1,1), (0,2,0), (1,0,1), (1,1,0), (2,0,0) }. The parameter h represents the "height" of the pyramid, which is defined as the maximum value of any coordinate.

Points in the set $P_{n,h}$ may be represented in a computer memory by an array of n+1 non-negative integer values whose sum is h. As the sum of the coordinates is known, one of the coordinates is redundant, so an alternative representation is to use an array of n non-negative integers whose sum is not greater than h. Other representations may be used in alternative arrangements. An efficient algorithm for quantizing to the $A_n$ lattice is described as Algorithm 4 in APPENDIX A.

Quantization error
The distance between a mapped vector 112 produced in the step 120 and a corresponding quantized vector 113 produced in the step 130 is the quantization error. The average quantization error typically depends on the distribution of the predetermined points in the set 611 and the shape of the Voronoi cells 620.

In the example described above, where the set 611 of predetermined points forms the pyramid $P_{n,h}$, the number of predetermined points is equal to the binomial coefficient

$|P_{n,h}| = \binom{n+h}{h} = \dfrac{(n+h)!}{n!\,h!}$   (2A)

where k! denotes the factorial of a non-negative integer k. The choice of the parameter h therefore controls the quantization error as well as the number of bits required to label the points. Binomial coefficients may also be written using an alternative notation: $^aC_b$ may also be used to denote the binomial coefficient "a choose b", or $\binom{a}{b}$.

Improved mapping
Fig. 7 provides an illustration of the subspace resulting from the mapping step 120. Fig. 7 shows a case where the feature vector dimension N = 3 and the subspace dimension n = 2, i.e. three dimensional feature vectors are projected onto a two dimensional simplex which is a triangle. In this example, some of the predetermined points such as 700 lie on a boundary 710 of the simplex.
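Which pyramid points lie on the boundary can be checked by direct enumeration for small n and h. The sketch below lists the points with n+1 non-negative integer coordinates summing to h, compares their number with the binomial coefficient (n+h)!/(n! h!), and counts the points having at least one zero coordinate, i.e. those lying on a face of the simplex. The function names are illustrative and not from the specification, and exhaustive enumeration is only practical for small cases, not for n = 127.

```python
from math import comb


def pyramid_points(n, h):
    """List all points with n+1 non-negative integer coordinates summing to h."""
    def build(dims, remaining):
        if dims == 1:
            yield (remaining,)
            return
        for first in range(remaining + 1):
            for rest in build(dims - 1, remaining - first):
                yield (first,) + rest
    return list(build(n + 1, h))


def boundary_count(points):
    """Count points that have at least one zero coordinate."""
    return sum(1 for p in points if 0 in p)


def label_bits(n, h):
    """Bits needed to encode an index in the range 0 .. (number of points - 1)."""
    return (comb(n + h, n) - 1).bit_length()


points = pyramid_points(2, 3)
print(len(points), comb(2 + 3, 2))  # enumerated size against the binomial coefficient
print(boundary_count(points))       # points on the boundary of the triangle
print(label_bits(2, 3))             # bits for a label of this small pyramid
print(label_bits(127, 8))           # same formula with n = 127 and an arbitrary h
```

In the n = 2, h = 3 case only the point (1,1,1) is interior, which shows how large a share of the predetermined points can sit on the boundary when h is small relative to n.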
As a consequence, the Voronoi cells located at the boundary 710 of the subspace are truncated, e.g. cells 720 and 730. These cells therefore have a smaller volume than Voronoi cells not intersecting the boundary. Quantization efficiency may be improved by adjusting the positions of the predetermined points so there are no points on the boundary of the simplex. An equivalent approach is to adjust the positions of the mapped vectors before quantizing. This approach is described in more detail hereinafter with reference to Fig. 8.

Fig. 8 illustrates how mapped feature vectors are adjusted before quantization by scaling them outwards from the center of the simplex. A scaling factor $(D + \alpha)/D$ is used, where $\alpha$ is a parameter controlling the amount of scaling and $D = h/\sqrt{n(n+1)}$ is the distance from the center of the simplex to the closest point on a face. The centre of a simplex means the point that is equidistant from each of its faces, and a face of a simplex with n+1 vertices is the part of the simplex contained in a hyper-plane passing through any n of its vertices. The choice of $\alpha$ affects the quantization error and the average number of feature vectors per cell. Therefore, the value of $\alpha$ determines accuracy. The value of the parameter $\alpha$ may be chosen to suit a particular application of the DBBI arrangement.

Unscaled, mapped feature vectors, such as feature vector 805, are located within the simplex 830. After scaling, the mapped and scaled feature vectors, such as the scaled feature vector 807, occupy a larger simplex 840. The scaled vectors are then quantized by mapping them to the closest predetermined point as before. This mapping effectively modifies the Voronoi cells, e.g. 810, 820, of points on the boundary of the simplex.
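The outward scaling described with reference to Fig. 8 can be illustrated with a minimal Python sketch. It assumes, which this passage does not state, that a mapped vector has n+1 non-negative coordinates summing to h, so that the centre of the simplex is the point whose coordinates all equal h/(n+1). The function name `scale_from_centre` is hypothetical.

```python
import math

def scale_from_centre(x, h, alpha):
    """Scale a mapped vector x outwards from the centre of the simplex.

    Assumption: x has n+1 coordinates that sum to h, so the centre of
    the simplex has every coordinate equal to h / (n + 1).
    """
    n = len(x) - 1
    d = h / math.sqrt(n * (n + 1))   # centre-to-face distance D
    factor = (d + alpha) / d         # scaling factor (D + alpha) / D
    centre = h / (n + 1)
    # Scaling about the centre keeps the coordinate sum equal to h.
    return [centre + factor * (xi - centre) for xi in x]

# A vertex of the simplex moves outside it; the coordinate sum is unchanged.
scaled = scale_from_centre([4.0, 0.0, 0.0], h=4, alpha=0.5)
```

Scaling about the centre, rather than about the origin, means no separate shift back into the hyper-plane is needed in this sketch.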
The scaled mapped feature vector is determined according to Equation (3) [the equation is illegible in the source text], where 1 denotes the N-dimensional vector in which each coordinate is 1, and the vector being scaled is the radial projection (as defined in Equation (1)) of a SIFT feature vector v.

Since the mapped vectors lie on a hyper-plane that does not include the origin, multiplying the mapped vectors by $(D + \alpha)/D$ (the first term on the right hand side of Equation (3)) moves the mapped vectors off the hyper-plane, so the scaled vectors need to be shifted back into the hyper-plane by subtracting a multiple of 1, as indicated by the last term in Equation (3).

In the arrangement described with reference to Fig. 8, there is no dimensional reduction, so the number of dimensions of the mapped and scaled feature vectors 437 is the same as the number of dimensions N of the uncompressed feature vectors 400, and the number of dimensions, n, of the subspace is one less than N. In the example depicted in Fig. 8, N = 3 and n = 2.

Algorithm 2 in APPENDIX A describes the complete compression process starting with a SIFT vector v and producing a label using the improved mapping. In Algorithm 2 the scale factor is defined in terms of the quantity $\beta$ from Fig. 8. The quantities $\alpha$ and $\beta$ are related by the equation:

$\dfrac{h + \beta}{h} = \dfrac{D + \alpha}{D}$ (3A)

where $D = h/\sqrt{n(n+1)}$ and n is the dimension of the pyramid. Optimal values for $\beta$ depend on n. Experiments indicate that for vectors that are uniformly distributed in the simplex bounding the pyramid, the optimal value of $\beta$ is 2(n + 1)n- 2 Rn × 0.5626. For SIFT vectors derived from real images a value of $\beta$ = 2(n + 1)n- 2 Rn × 0.14S5 .

Improved quantization

When the mapped feature vectors are scaled relative to the lattice so that there are no lattice points on the boundary, another issue arises. The lattice points form an infinite set, and if a scaled feature vector falls outside the simplex 830, the closest lattice point may lie outside the simplex 830.
Therefore, algorithms for finding the closest lattice point may return a lattice point that is not one of the predetermined points. As a result the quantization method needs to be adjusted to ensure that only one of the predetermined points is returned. This problem may be solved by adjusting mapped and scaled feature vectors lying outside the simplex 830 to the nearest point on the boundary of the simplex 830 before quantizing to the lattice. The procedure for performing this adjustment is described as Algorithm 1 in APPENDIX A.

Calculation of labels defined by pyramid point enumeration

Figs. 9 and 10 depict the label calculation step 140 of Fig. 1 in more detail. In the DBBI arrangement, a label for a particular predetermined point is created, based on an index that is defined according to a predefined order in the set of predetermined points. Using labels based on such an ordering allows quantized vectors to be represented using the least possible number of bits. If there are P points, then $\lceil \log_2(P) \rceil$ bits are required to uniquely represent an arbitrary label.

Fig. 9 shows an example in which the predetermined points are counted line by line starting from a reference 900 in the order indicated by the dashed line 911. The first point 900 is labelled with a 0. The label for the 17th point 910 is 16. Fig. 9 shows the case where the feature vector dimension N is 3, and the feature vectors have been projected onto a two dimensional simplex (n = 2). In this case, the predetermined points form a triangle. The line-by-line counting enumeration can be extended to arbitrary dimensions, and labels may be determined using a formula based on a sum of binomial coefficients. In the example described above in relation to Fig. 9, the predetermined points are points of a scaled and translated copy of an $A_2$ lattice, forming a 2-dimensional pyramid $P_{2,5}$ as defined earlier.

Fig.
10 illustrates a method of labelling the points in a pyramid according to one arrangement. Fig. 10 depicts the three dimensional pyramid $P_{3,3}$, which is defined by the equation

$P_{3,3} = \{\, p \in \mathbb{Z}^{4} \mid \sum_i p_i = 3,\ \forall i\ p_i \ge 0 \,\}$ (3B)

where the aforementioned pyramid is comprised of four layers containing a total number of points equal to the binomial coefficient value

$\binom{3+3}{3} = 20$ (3C)

as described in Equation (2A). The four layers 1010, 1020, 1030, and 1040 are shown separately. Each layer consists of a two dimensional pyramid, i.e. a triangle, of points. The first layer consists of a degenerate triangle consisting of a single point 1010. The second layer 1020 consists of a triangle with two points along each side. The third layer 1030 consists of a triangle with three points along each side, and so on. In general, each layer of an n dimensional pyramid is an n-1 dimensional pyramid. The "dimensionality" or "dimension" of a pyramid comprised of points in a Euclidean space refers to the dimensionality of the smallest Euclidean subspace that contains all of the points.

Pyramids may be recursively subdivided into pyramids of smaller dimension. According to equation (2), for any positive integers n and h, the coordinates of all of the points in the pyramid $P_{n,h}$ sum to h, so one coordinate is redundant. If the last coordinate of each point is ignored, the points may be organized into pyramids of dimension n-1, with each point p belonging to the pyramid $P_{n-1,H}$, where H is the sum of the first n coordinates of p.

The points are labelled according to a sequence starting at a corner of each layer, and each layer is labelled recursively. According to this example, a point 1090 is assigned the label 15. This label can be determined by counting the number of points that precede the point to be labelled in the sequence. The choice of starting corner is arbitrary. The points that precede the point 1090 can be partitioned into a disjoint union of pyramids 1060, 1070 and 1080.
The pyramid 1060 is a three dimensional pyramid comprised of three layers 1010, 1020 and 1030 with three points along each edge, 1070 is a two dimensional pyramid with two points along each side, and 1080 may be considered to be a one dimensional pyramid of height equal to 1 (where height is shown horizontally for pyramid 1080) with two points along each side. Thus, the label can be expressed as follows:

$l = |P_{3,2}| + |P_{2,1}| + |P_{1,1}| = 15$ (4)

where: $|P_{3,2}|$ is the number of points in the three dimensional pyramid 1060, with height equal to 2, $|P_{2,1}|$ is the number of points in the two dimensional pyramid 1070, with height equal to 1, and $|P_{1,1}|$ is the number of points in the one dimensional pyramid 1080, with height equal to 1.

In general, the label l(v) associated with a predetermined point in an n-dimensional pyramid represented by an n+1-dimensional vector v may be determined as a sum of binomial coefficients according to the following formula:

$l(v) = \sum_{i=1}^{n} |P_{i,\,s_i(v)-1}| = \sum_{i=1}^{n} \binom{i + s_i(v) - 1}{i}$, where $s_i(v) = \sum_{k=1}^{i} v_k$ (5)

where: $s_i(v)$ denotes the sum of the first i coordinates of a vector v.

Note that the last coordinate is redundant and is not used in the calculation of the label. In this example, each binomial coefficient in the sum represents the number of points in a pyramid.

Calculation of labels using blocks of bits

Fig. 11 depicts an alternative method for determining labels in the label calculation step 140. Fig. 11 shows an example of a quantized feature vector 1110 represented as an array of integer values for coordinates of one of the predetermined points in the pyramid $P_{6,11}$. To determine a label 1120, each coordinate in the quantized vector is replaced by a block of bits, where the number of bits is determined by the coordinate value, and the value of each bit is 0 (zero). According to one DBBI arrangement, the number of zero bits equals the value of the coordinate. E.g. the first coordinate of the vector 1110 is 2, which is represented by a block 1130 of two 0 (zero) bits. The blocks of 0 (zero) bits are separated by a 1 (one) bit such as bits 1140, 1141. Zero coordinate values such as 1142 are skipped, and they are represented by empty blocks of 0 (zero) bits such as 1143, but the separating 1 (one) bits still need to be inserted.

Since the sum of the coordinates of each quantized vector in the pyramid is the same, the labels are all the same size because the number of zero bits in each label is equal to the sum of the coordinates and the number of 1 (one) bits is equal to one less than the number of coordinates. Thus each label requires h + n bits. This can be reduced by one more bit because one bit of such a label is redundant; since the number of 1 (one) bits is known, the value of any single bit can be determined by counting the number of 1 (one) bits in the remaining h + n - 1 bits. Equivalently, any single bit can be determined by counting the number of 0 (zero) bits in the remaining h + n - 1 bits. Therefore the storage of such labels can be reduced by one bit by omitting one bit, for example, the last bit from each label.

Unlike labelling methods based on enumeration, such as the pyramid point enumeration method described above, this labelling method is not optimal, but it is close to optimal when h and n are approximately equal. The method also has the advantage of being simple and fast to compute as well as being simple to reverse, i.e.
given a label, the coordinates of the quantized vector corresponding to the label can be determined by counting the numbers of consecutive 0 (zero) bits and consecutive 1 (one) bits in the label.

Many variations of this method are possible, such as using blocks of 1 (one) bits separated by 0 (zero) bits, or the order of the bits may be reversed or permuted. Pseudocode describing the process for determining a label from a vector, and the reverse process for calculating a vector from a label, using the labelling scheme depicted in Fig. 11, is given in Algorithms 7 and 8 in APPENDIX A. When N=128, a value of h = 133 results in labels that require 256 bits to store and this is close to optimal for compressing SIFT vectors.

Calculating labels for more general sets of predetermined points

In an alternative arrangement, the set of predetermined points does not form a pyramid. When the set of points can be expressed as a disjoint union of pyramids, the pyramid point enumeration method can be applied by first determining an order on the pyramids and enumerating the points in each sequence using the ordering described for pyramids. Consider the points from three pyramids: $S_1$, $S_2$ and $S_3$, where each point in $S_1$ is assigned a label using the pyramid point enumeration method described above. Each point $p_2$ in $S_2$ is assigned a label $|S_1| + l_2$, where $l_2$ is the label of $p_2$ as determined by the pyramid enumeration method applied to $S_2$. Finally each point $p_3$ in $S_3$ is assigned a label $|S_1| + |S_2| + l_3$, where $l_3$ is the label of $p_3$ as determined by the pyramid enumeration method applied to $S_3$, and where $|S|$ denotes the number of elements in the set S. This method can be used to assign labels to any finite set of predetermined points that can be decomposed into disjoint pyramids. Further, the method of labelling may be applied to sets of points that are defined by lattices other than $A_n$, including $\mathbb{Z}^n$, $D_n$, $D_n^*$ and $A_n^*$.
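The block-of-bits labelling described with reference to Fig. 11 can be illustrated with a minimal Python sketch. It is not Algorithms 7 and 8 of APPENDIX A; the function names are hypothetical, the label is held as a string of "0" and "1" characters for readability, and the example point is an arbitrary member of $P_{6,11}$ rather than the vector 1110 of Fig. 11 (only its first coordinate, 2, is taken from the text). The redundant final bit is kept.

```python
def bits_from_vector(q):
    """Encode each coordinate as a run of '0' bits, separated by single '1' bits."""
    return "1".join("0" * c for c in q)

def vector_from_bits(bits):
    """Invert bits_from_vector by measuring the runs of '0' bits."""
    return [len(run) for run in bits.split("1")]

# Hypothetical point with n + 1 = 7 coordinates summing to h = 11.
q = [2, 0, 3, 1, 0, 4, 1]
label = bits_from_vector(q)
```

A zero coordinate yields an empty run, so two separator bits appear side by side, which matches the empty blocks described above.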
In other DBBI arrangements, enumeration methods for other sets of predetermined points that can be expressed in terms of binomial coefficients may be used to define the labels assigned in the step 140.

Calculating vectors from labels

Compressing feature vectors to form labels has the advantage of reducing storage requirements and speeding up feature comparison, but there is no obvious way to determine if two different labels represent similar feature vectors. While comparing compressed feature vectors (i.e. labels) to compare images is very effective for exact matching of images, it is less useful for approximate matching, when finding similar images that are not identical to a query image is required. To locate similar images, it is advantageous to compare the feature vectors using Euclidean distance (or some other distance) as a measure of similarity, rather than comparing the labels.

In one DBBI arrangement compressed feature vectors are decompressed by inverting steps of the compression process 100 described in Fig. 1. Having decompressed feature vectors facilitates comparison of feature vectors using Euclidean distances while gaining the storage benefits of having compressed feature vectors.

Fig. 14 depicts decompression, as depicted by a dashed arrow 1411, of compressed feature vectors, and illustrates a process for converting labels to vectors according to a DBBI arrangement. Since the labels determined in the step 140 of Fig. 1 correspond one-to-one with the quantized points determined in the step 130, it is possible to invert the process 140 for calculating labels in order to recover the coordinates of a quantized vector. Given a label 1410, a delabelling process 1420 determines coordinates of a corresponding n+1 dimensional (mapped) quantized vector 1430. The quantized n+1 dimensional vector is then subject to an unmapping process 1440 to produce an N-dimensional normalized vector 1450.
The unmapping process 1440 is only applicable when an invertible mapping step 120 has been used in Fig. 1, such as the radial projection described earlier in regard to equation 1.

If the mapping step 120 is performed by applying the radial projection, then N = n+1 and the unmapped vector 1450 corresponding to a (mapped) quantized vector q (e.g. 1430) may be determined by dividing each coordinate of the quantized vector 1430 by the Euclidean length of q. This is described by the following equation:

$q \mapsto \dfrac{q}{\lVert q \rVert}$ (6)

where: $\lVert q \rVert$ denotes the Euclidean length of the vector q.

Note that the normalized feature vector 1450 is only an approximation of the original feature vector 400 from which the label 1410 was determined, because the information lost during the quantization step 130 cannot be recovered. As a result of the quantization many different feature vectors may compress to the same label.

When the labelling method used in the step 140 is the method illustrated in Fig. 11, the step 1420 may be performed by counting bits in the label as described earlier.

One delabelling method 1420 for efficiently determining the coordinates of a quantized vector given its label as determined by the pyramid point enumeration method is described below. Algorithm 3 in APPENDIX A is pseudocode corresponding to the decompression process depicted in Fig. 14.

Pyramid point coordinate calculation

If l is the label of a point q in the pyramid $P_{n,h}$ as determined according to the pyramid point enumeration method described above with reference to Figs. 9 and 10, then the last coordinate of q may be determined using the following formula:

$q_{n+1} = h - (j + 1)$ (6A)

where j is the largest integer such that

$|P_{n,j}| \le l$. (6B)

A similar calculation may be performed to determine each of the remaining coordinates of q. To simplify the description of this calculation, it is convenient to define $|P_{n,-1}| = 0$, for all integers n, i.e.
a pyramid with height -1 has zero points along each side and contains no points. Also note that

$|P_{n,0}| = |P_{0,n}| = 1$, for all n > 0. (6C)

Fig. 15 depicts a delabelling method for determining the coordinates of a (mapped) quantized point q from its label l. Coordinates are determined in reverse order, starting with the last (n+1th) coordinate down to the first coordinate. The method 1500 starts at an initialisation step 1510 where three variables L, H, and i are initialized. The aforementioned variables represent the label, the height of the pyramid and the dimensionality of the pyramid respectively. L is initialized to the label l of the quantized point q, which equals the index of q in the enumeration. Initially the height H is set to the height h of the full pyramid (one less than the number of points along each side), and the dimension i is set to the dimensionality n of the full pyramid (one less than the number of coordinates of each point).

The method then proceeds to a step 1520 where the height j of the largest (sub)pyramid of the current dimension that does not contain q is determined. The number of points $|P_{i,j}|$ in the (sub)pyramid is also determined. The value of the i+1th coordinate is then determined in a step 1530 using the formula $q_{i+1} = H - (j + 1)$. The calculation then proceeds to a step 1540 where the label L is updated by subtracting $|P_{i,j}|$ from it, the height of the pyramid is set to j+1 and the dimension i is decremented by 1. The new value of L is the index of q in a (sub)pyramid of lower dimension representing a layer of the pyramid. In a following step 1550 a test is performed to check if all coordinates have been determined. If this is the case, the method ends, otherwise the method repeats the steps 1520, 1530 and 1540 until all coordinates have been determined. Each time these steps are performed, a new label L is determined representing the index of q in a smaller pyramid representing a layer of a containing pyramid.
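The enumeration of Equation (5) and the delabelling loop of Fig. 15 can be checked against each other with a minimal Python sketch. It is not Algorithm 5 of APPENDIX A: the function names are hypothetical, and the step 1520 is replaced here by a plain brute-force search. The example uses the point with label 15 in $P_{3,3}$; its coordinates (2, 0, 1, 0) are derived from Equation (4), not read from Fig. 10.

```python
from math import comb

def pyramid_size(i, j):
    """|P_{i,j}| = C(i+j, i), with |P_{i,-1}| = 0 by convention."""
    return comb(i + j, i) if j >= 0 else 0

def label_of(v):
    """Equation (5): sum over i of |P_{i, s_i(v)-1}|; the last coordinate is unused."""
    total, s = 0, 0
    for i, c in enumerate(v[:-1], start=1):
        s += c                      # s is s_i(v), the sum of the first i coordinates
        total += pyramid_size(i, s - 1)
    return total

def point_of(label, n, h):
    """Fig. 15 loop, with brute-force search standing in for step 1520."""
    q = [0] * (n + 1)
    L, H = label, h                 # step 1510
    for i in range(n, 0, -1):
        j = -1
        while pyramid_size(i, j + 1) <= L:   # largest j with |P_{i,j}| <= L
            j += 1
        q[i] = H - (j + 1)          # step 1530 (list index i holds coordinate i+1)
        L -= pyramid_size(i, j)     # step 1540
        H = j + 1
    q[0] = H                        # the remaining height is the first coordinate
    return q
```

For example, `label_of([2, 0, 1, 0])` evaluates to 10 + 3 + 2 = 15, and `point_of(15, 3, 3)` recovers `[2, 0, 1, 0]`.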
Pseudocode for this method is provided as Algorithm 5 in Appendix A.

Since

$|P_{i,j}| = \binom{i+j}{i}$, (6D)

the step 1520 of the pyramid point coordinate calculation method 1500 is equivalent to determining an inverse of the binomial coefficient function. According to one DBBI arrangement, an efficient method for determining an inverse of the binomial coefficient function is described below. Algorithms 6 and 6A in APPENDIX A are pseudocode examples of the process described in Fig. 15.

Binomial coefficient inversion

Determination of the coordinates of a quantized vector such as 1430, given its label 1410, requires solving binomial coefficient inversion problems of the following form:

Given positive integers i and L, determine the largest integer j such that

$\binom{i+j}{i} \le L$. (6E)

One instance of this problem needs to be solved to determine each coordinate of the quantized vector.

To solve an instance of the binomial inversion problem, the required value of j may be found, for example, by brute force search. Brute force search is efficient for small values of i or j, and has the advantage that binomial coefficients are very efficient to calculate incrementally. However, the brute force approach is not practical for large pyramids due to the large number of coefficients that need to be evaluated.

Various continuous approximations are known for binomial coefficients. For example Stirling's approximation may be used to calculate accurate approximations of $\binom{i+j}{i}$, but it is difficult to see how to efficiently calculate inverses for any of the known approximations. Numerical approaches for determining the inverse do not seem to be faster than binary search.

A method of solving the binomial coefficient inversion problem will now be described, based on a continuous approximation to the binomial coefficient function whose inverse can be efficiently determined.
The approximation is used to calculate bounds that are used to reduce the scope of the search, so that only a small number of binomial coefficients need to be evaluated to find the exact solution. The disclosed solution makes it possible to solve the binomial coefficient inversion problem for values of i up to 140 (which thus permits solution for 128-dimensional SIFT vectors) without the need to determine more than one binomial coefficient value. For values of i up to 1,000, no more than ten binomial coefficient values need to be determined. The method is effective and provides significant benefit for values of i up to 1,000,000.

The number of points in a pyramid $P_{i,j}$ is equal to the number of points with integer coordinates inside the following simplex:

$S_{i,j} = \{\, x \in \mathbb{R}^{i} \mid \sum_k x_k \le j,\ \forall k\ x_k \ge 0 \,\}$, (6F)

i.e. for any non-negative real numbers i and j, $S_{i,j}$ is the set of points with non-negative real valued coordinates such that the sum of the coordinates of each point is less than or equal to j. If i and j are integers, the points in $S_{i,j}$ with integer coordinates are just the points of $P_{i,j}$ with the last coordinate removed. The number of points with integer coordinates in $S_{i,j}$ is approximately equal to the volume of the enclosing simplex $S_{i,j}$. $S_{i,j}$ is an example of a right simplex, and its volume is straightforward to determine. More precisely, from geometric considerations, bounds can be established on the binomial coefficient $\binom{i+j}{i}$, which is also equal to the number of points in $S_{i,j}$ with integer coordinates, as given in Equation (7) [the equation is illegible in the source text], where $v_i(j)$ is the volume of the simplex $S_{i,j}$.

To describe the binomial coefficient inversion method in detail, it is convenient to define the following function:

$p_i(j) = \dfrac{\Gamma(i + j + 1)}{\Gamma(i + 1)\,\Gamma(j + 1)}$ (8)

where: $\Gamma$ denotes the Euler Gamma function.

Note that

$p_i(j) = \binom{i+j}{i}$ (8A)

for integer values of i and j.
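As an informal check of Equations (8) and (8A), the following minimal Python sketch evaluates the continuous function through the log-Gamma function of the standard library and compares it with the exact binomial coefficient at integer arguments. The function name `p` and the use of `math.lgamma` are choices made for this sketch and are not taken from the specification.

```python
from math import comb, exp, lgamma

def p(i, j):
    """Equation (8): Gamma(i+j+1) / (Gamma(i+1) * Gamma(j+1)), for real j >= 0."""
    return exp(lgamma(i + j + 1) - lgamma(i + 1) - lgamma(j + 1))

exact = comb(3 + 2, 3)     # binomial coefficient at integer arguments, Equation (8A)
approx = p(3, 2)           # agrees with `exact` up to floating point rounding
between = p(3, 2.5)        # the function is also defined between integers
```

Working with logarithms of the Gamma function keeps intermediate values within double precision range even when the coefficient itself is very large.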
This notation allows the binomial coefficient function to be treated as a parameterized family of functions of one variable, so it makes sense to talk about the function $p_i$ and its inverse $p_i^{-1}$ for fixed values of the parameter i. $p_i$ and its inverse are real-valued continuous monotonic functions for all positive integers i. For any given integer i, the volume function $v_i$ is an approximation to the function $p_i$ as outlined in equation 8. Since it is inexpensive to evaluate the inverse volume function $v_i^{-1}$, the inverse volume function serves as a useful approximation to the inverse, $p_i^{-1}$, of the binomial function $p_i$.

Given the value of the label L and dimension i, and setting $L = p_i(j)$, the bounds stated in equation 7 can be used to establish upper and lower bounds on j by applying $v_i^{-1}$ to both sides of both inequalities and re-arranging terms, as shown in Equation (9) [the equation is illegible in the source text].

Analysis of the family of functions $k_i$ defined by the equation

$k_i(j) = v_i^{-1}(p_i(j))$ (9A)

leads to improved bounds. Since $v_i$ approximates $p_i$, the function $k_i$ approximates the identity function, i.e. $k_i(j)$ is approximately equal to j. Therefore the quantity $j^{-1} k_i(j)$ indicates the proportional error in the inverse of the function $v_i$ as an approximation of the inverse of the function $p_i$ for given values of i and j. The Laurent series for $j^{-1} k_i(j)$ is described as follows:

$j^{-1} k_i(j) = 1 + \dfrac{i+1}{2j} - \dfrac{i^2 - 1}{24 j^2} + \dfrac{(i-1)(i+1)^2}{48 j^3} + \dots$ (10)

Consideration of the first two terms of the Laurent series (10) provides an improved bound described as follows:

$p_i^{-1}(L) \ge v_i^{-1}(L) - \dfrac{i+1}{2}$ (11)

Consideration of the first three terms of the Laurent series provides an upper bound that is a closer approximation. It is defined by Equations (12), (12A) and (12B) [these equations are illegible in the source text], which express the bound in terms of a function $h_i(c)$ of $v_i^{-1}(c)$ defined for positive values of c.

Fig. 13 depicts a method 1300 of binomial coefficient inversion that can be used in the step 1520 in Fig.
15. Given the integer values of the label L and dimension i as inputs, the method 1300 starts at a step 1310 by determining the height of a simplex of dimension i with volume L, where L is also the label value, using the formula

$v_i^{-1}(L) = (i!\,L)^{1/i}$ (12C)

where $v_i^{-1}(L)$ is the height of a pyramid of dimension i with volume L.

In one DBBI arrangement, the step 1310 may be performed using logarithms according to the following formula:

$v_i^{-1}(L) = \exp\left(\dfrac{\log(i!) + \log(L)}{i}\right)$ (13)

where: "exp" denotes the exponential function, i.e. exp(x) denotes $e^x$ for any real number x, and "log" denotes the natural logarithm function, i.e. the logarithm to base e, where "e" is Euler's number.

By using a table to represent log(i!) and standard floating point mathematics functions, the height of the simplex can be determined efficiently and accurately for large values of i and L using only double precision arithmetic.

The method then proceeds to a step 1320 where a bound for the inverse value $\lfloor p_i^{-1}(L) \rfloor$ is determined based on the determined height. The determined height determines a bound on $p_i^{-1}(L)$ and hence also on $\lfloor p_i^{-1}(L) \rfloor$. The bound may be any of the bounds described above in equations 9, 11 or 12. Each of these bounds is the solution of a polynomial equation involving the determined height. Other similar bounds may also be used.

The method then proceeds to a step 1330 where the exact solution to the inverse problem is found by conducting a sequential search. The sequential search is performed by calculating binomial coefficients $p_i(j)$ for successive integer values of j, starting with the estimated inverse value, and comparing each binomial coefficient against L, until the exact inverse value is found. The largest value of j for which $p_i(j) \le L$ is returned as the exact solution. When starting from an estimate based on a lower bound, successive values of j are determined by incrementing j.
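The steps 1310 to 1330 described so far can be illustrated with a minimal Python sketch. It is not Algorithm 6 or 6A of APPENDIX A. The starting estimate used for step 1320 is an assumption of this sketch: the simplex height minus (i+1)/2, which is a lower bound on the real-valued inverse by the inequality between arithmetic and geometric means. The function name `invert_binomial` is hypothetical, and log(i!) is obtained from `math.lgamma` rather than from a table.

```python
from math import comb, exp, floor, lgamma, log

def invert_binomial(i, L):
    """Largest integer j >= 0 with C(i+j, i) <= L, for integers i >= 1 and L >= 1."""
    # Step 1310: height of the i-dimensional simplex with volume L,
    # (i! * L) ** (1 / i), evaluated with logarithms as in Equation (13).
    height = exp((lgamma(i + 1) + log(L)) / i)
    # Step 1320: starting estimate from a lower bound (assumption of this
    # sketch), reduced by one to allow for floating point rounding.
    j = max(0, floor(height - (i + 1) / 2) - 1)
    # Step 1330: sequential search. The first loop only runs if rounding
    # pushed the estimate too high; the second walks up to the exact value.
    while j > 0 and comb(i + j, i) > L:
        j -= 1
    while comb(i + j + 1, i) <= L:
        j += 1
    return j
```

For instance, `invert_binomial(3, 15)` returns 2, the height used for the first term of Equation (4). The exact integer comparison in the final loop means the result does not depend on the accuracy of the floating point estimate.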
When starting from an estimate based on an upper bound, successive values of j are determined by decrementing j.

Alternatively, a binary search may be used in the step 1330 to find the exact solution, using any of the bounds described above to constrain the range of the search.

If i is sufficiently small, no searching is required, as the bound may be used as an approximation for the parameter value. Thus, for example, if i is less than 100, the difference between the parameter value and the bound defined by the equation 12 is less than 1. Therefore the step 1330 is not required and may be omitted. In that case, the parameter is equal to the floor of the bound.

After the step 1330, the process 1300 ends.

Application to image retrieval

Several DBBI arrangements will now be described with reference to Figs. 12A, 12B, 12C and 12D to illustrate possible trade-offs between storage space, accuracy and retrieval speed for image retrieval applications.

Fig. 12A depicts an example database containing four images 1201, 1202, 1203 and 1204. Feature vectors $V_{sun1}$, $V_{sun2}$, $V_{sun3}$, $V_{sun4}$, $V_{house}$, $V_{mouse1}$, $V_{mouse2}$, $V_{cat1}$ and $V_{cat2}$ are extracted from these four images according, for example, to the step 110 in Fig. 1. In this example, the feature vectors are compressed 1205, according for example to the steps 120, 130 and 140, to three labels $l_1$, $l_2$ and $l_3$. This illustrates (a) the fact that each image may be represented by more than one feature vector, and (b) that distinct feature vectors may be quantized to the same predetermined point (i.e. label). A database, such as 299 in Fig. 2, stores the labels $l_1$, $l_2$ and $l_3$ and associates each label with the corresponding image from which the quantized feature vector was taken. E.g. the label $l_2$ is associated with three images 1201, 1202, and 1203.
The actual images, such as 1201, 1202 and 1203, may be included in the same database 299, or they may be stored in separate storage, in which case only the index or pointer of each image is stored in the database 299 and associated with its corresponding labels.

To find a match, according to the step 340 in Fig. 3, for a query image 1200 (also see 310 in Fig. 3), two feature vectors $V_{q1}$ and $V_{q2}$ are extracted, according to the step 320 in Fig. 3, from the query image 1200 and these feature vectors are reduced (compressed) to labels $l_{q1}$ and $l_{q2}$ respectively, according to the process described in regard to Fig. 1. The labels {$l_{q1}$, $l_{q2}$} are used to find, according to the step 340 in Fig. 3, images that match the query by executing a database search to find images in the database 299 with matching labels.

In the simplest approach, the entire database 299 is searched. For each image represented in the database 299, the set of labels derived from the image is compared to the set of labels {$l_{q1}$, $l_{q2}$} derived from the query image 1200 to determine if they match. I.e. the set {$l_{q1}$, $l_{q2}$} is compared to each of the sets {$l_1$, $l_2$}, {$l_1$, $l_2$, $l_3$} and {$l_3$}. Using this approach, it is found that both the image 1201 and the image 1202 match the labels {$l_{q1}$, $l_{q2}$} that are derived from the query image 1200. The image 1201 is returned as an exact match to the searched image 1200 because the labels {$l_{q1}$, $l_{q2}$} in the query can be matched up one-to-one with the labels derived from the image 1201, i.e. $l_{q1}$ and $l_{q2}$ match $l_1$ and $l_2$ respectively. The next highest match score is found for images 1200 and 1202, as two of the three labels are found to match and are identical. As a result of the match score, the images 1200 and 1202 may be classified as having a partial correspondence. In this approach, comparison of features is fast, since image features are represented by labels and labels are compact in size.
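The label-set comparison of Fig. 12A can be illustrated with a minimal Python sketch. The scoring rule (the fraction of an image's labels that also occur in the query) is an assumption of this sketch; the specification refers to a match score without defining it in this passage. The image owning the set {$l_3$} is not identified in the text, so it is given a placeholder key.

```python
def match_score(query_labels, image_labels):
    """Fraction of the image's labels that also occur in the query (assumed rule)."""
    return len(query_labels & image_labels) / len(image_labels)

database = {
    "1201": {"l1", "l2"},
    "1202": {"l1", "l2", "l3"},
    "other": {"l3"},          # placeholder: image number not stated in the text
}
query = {"l1", "l2"}          # assumes lq1 equals l1 and lq2 equals l2

scores = {name: match_score(query, labels) for name, labels in database.items()}
best = max(scores, key=scores.get)
```

Under this rule the image 1201 scores 1 (exact match) and the image 1202 scores two thirds (partial correspondence), in line with the example above.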
The approach also has a further advantage, in that the computer memory is able to store more labels than feature vectors due to the compressed size of the labels, and so labels can be accessed faster than feature vectors because of the faster access to memory.

For exact matching applications, feature vectors do not need to be stored in the database. Since identical images produce identical labels, matching images can be found by directly comparing labels derived from the query image with labels derived from images in the database.

The advantages provided by using labels instead of feature vectors apply also to more sophisticated inexact matching methods. If approximate matching of images is required, such as when looking for similar images, exact matching of labels cannot always be used. If the query image is only an approximate match to a database image, then the feature vectors derived from the query image may not be exactly the same as those derived from the database image. The query image and the corresponding database image may generate different labels even when the feature vectors are very similar. In this case, the matching image may not be found using exact matching of labels.

For approximate image matching, where database images similar to the query image are to be found, a different approach is required. One approach, known as "point perturbation", may be applied to any hash-based matching method. The approach generates multiple hash values from a single feature vector by perturbing the feature vector. The feature vector is perturbed by adding a small randomly generated vector to move the feature vector before calculating the hash. The perturbation is done multiple times to generate multiple hash values for each feature vector. Labels derived from the perturbed feature vectors are then matched to labels stored in the database.
Point perturbation increases the likelihood of finding matching images when an approximate match is required, but the cost of searching is increased due to the extra comparisons required for each perturbed feature vector.
Other DBBI arrangements use other measures of closeness between feature vectors to find matching feature vectors. The efficient means to convert labels back to vectors, described above in regard to Fig. 14, allows matching to be performed using matching techniques that rely on the Euclidean distance (or another distance) as a measure of closeness between feature vectors. The criteria used for determining a match between two labels can involve a predefined distance threshold.
Fig. 12B and Fig. 14 illustrate how inexact matching may be performed using one DBBI arrangement. In this DBBI arrangement, N-dimensional feature vectors V_q1 and V_q2 are extracted from the query image 1200. To match the feature vectors {V_q1, V_q2} to features of images represented in the database 299, labels l_1, l_2 and l_3 are retrieved from the database 299 for the images 1201-1204. The labels l_1, l_2 and l_3 are converted to corresponding vectors by the delabelling process 1420 and the unmapping process 1440 described in relation to Fig. 14, to produce the corresponding N-dimensional vectors V_1, V_2 and V_3. This allows the labels l_1, l_2 and l_3 to be converted to vectors V_1, V_2 and V_3 which can be directly compared to the feature vectors V_q1 and V_q2. Since conversion from labels to vectors can be performed efficiently, the vectors V_1, V_2 and V_3 do not need to be stored and can be re-determined from stored labels as required.
In another DBBI arrangement shown in Fig. 12C, the vectors V_q1 and V_q2 are mapped to a Euclidean subspace such as the subspace 425 described in relation to Fig. 4. The mapping may be performed using any of the mapping methods previously described to form mapped query vectors M_q1 and M_q2.
The mapped query vectors M.1 and Mq2 are compared against P0051 10_specilodged / 5830629_1 141010 -28 vectorsQi,Q 2 and Q3, which are generated by the delabelling process 1420when applied to the labels 1i, 12 and 13, respectively. Comparison of the vectors Mqt and Mq2in the Euclidean subspace is faster because the unmapping step does not have to be performed, but the comparisons are less accurate than direct comparisons of unmapped feature vectors because the mapping step distorts distances. In another DBBI arrangement, illustrated in Fig. 12D, a difference between each mapped feature vector 437 and a corresponding quantized vector 439 is determined and stored in the database. For example, a difference D1 is determined between the mapped feature vector Mun, and the quantized vector Qi. An association is made in the database between each stored ) difference, the image it was derived from and the label determined from the corresponding quantized vector. In this example, comparing the image 1200 against image 1201, the feature vector V..i, is recovered by delabelling its label li to recover the corresponding quantized vector Q, and adding the stored difference Di to Qi to recover the mapped feature vector M.1, and applying the unmapping process 1440 to the mapped vector Muni to produce the i vector Vu,,i. Similarly, the feature vector V,,,n 2 , is recovered by delabelling the label 12 to recover the corresponding quantized vector Q2 and adding the stored difference D 3 to Q2 to recover the mapped vector Msun2 and applying the unmapping process 1440 to the mapped vector Msun2 to produce the vector Vun 2 . The advantage of this arrangement is that matching is more accurate than the techniques described in relation to Figs. 12A to 12C since the ) recovered feature vectors are not approximated by quantized versions making it possible, for example, to distinguish between the feature vectors Vcat,, Vcat2, and Vhouse, even though all three vectors quantize to Q3. 
While some storage is required per feature vector, less storage is required than for the uncompressed feature vectors, because the difference vectors are in general smaller than the feature vectors, so fewer bits can be allocated to each coordinate 5 without loss of accuracy. Application to video and image compression The DBBI arrangements are also applicable to video and image compression. Transform coding and lattice vector quantization are used for video and still image compression.Image data may be compressed by dividing the image into blocks, using transform coding to 0 transform the blocks, quantizing the transformed blocks using a lattice vector quantizer and calculating labels representing the quantized values. The labels may be used as a compressed representation of the image. This process may be reversed to decompress the compressed P0051 10_specilodged / 5830629_1 141010 -29 image and recover a close approximation of the original image. An example of this process will now be described in more detail. Statistics of images, after applying transform coding (e.g. DCT or wavelet coding), can be modelled by memoryless Laplacian sources. Lattice vector quantization can be used for compressing such sources. Thus, for example, if the lattice is the translated lattice 2Z" + 1 then the set of possible quantized values or codes will be a set of the following form: Q = E 2Z" + 1 }xl 4 M for some integer M. The set Q may be partitioned into 2" pyramids depending on the signs of the coordinates of the point, and a two part label may be assigned to each point. The first part of the label is an n-bit binary number indicating which pyramid the point belongs to, and the second part ) indicates the location of the point in the pyramid. Each bit of the first part of the label indicates the sign of the corresponding coordinate of the point. 
Thus, for example, if the second coordinate of the point is negative, then the second bit is set to zero, otherwise the second bit is set to 1. The correspondence between points in Q and points in the pyramid P_{n,h} is defined by the following formula:

$$m(x) = \left( \frac{|x_1| - 1}{2},\ \frac{|x_2| - 1}{2},\ \ldots,\ \frac{|x_n| - 1}{2},\ h - \sum_{i=1}^{n} \frac{|x_i| - 1}{2} \right) \qquad (15)$$

where $h = \left\lfloor \frac{M - n}{2} \right\rfloor$.
For any point x ∈ Q, m(x) ∈ P_{n,h}, and the second part of the label for x may be calculated according to the following equation:

$$l_2(x) = l(m(x)) \qquad (16)$$

where l is the labelling function defined in equation (5).
Given a two-part label for a code x ∈ Q, m(x) may be recovered from the second part using the binomial coefficient inversion method described above, and accordingly, by using the signs of the coordinates defined by the first part, it is straightforward to calculate x.
Vectors that are quantized using other lattices may be similarly represented using labels defined in terms of binomial coefficients. Accordingly, application of the DBBI arrangements to image and video compression is not limited to lattice vector quantization based only on the translated lattice 2Z^n + 1.
Application to document identification by keyword matching
The DBBI arrangements are not limited to image based applications. Thus, for example, the DBBI arrangements may be applied to identification of documents using keywords. An example of such an application will now be described.
Given a predefined set K of keywords of size N, the keywords present in a document may be encoded as a binary vector of N bits where each bit of the vector indicates the presence or absence of a keyword. Thus, for example, if the third keyword is present, the third bit in the binary vector will be set to 1 and if the third keyword is not present, the third bit will be set to 0. If a document contains n keywords, then the binary vector associated with the document will have n non-zero bits. The binary vector may be converted into a vector of integers of length n+1 by counting the numbers of consecutive 0 bits. There will be n+1 blocks of consecutive 0 bits (some of which may be empty), each block corresponding to a single coordinate in the vector of integers.
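The conversion between a keyword bit vector and its vector of zero-run lengths can be sketched as below. The function names are hypothetical and chosen only for this illustration; the specification describes the conversion in words and does not name it.

```python
def keywords_to_runs(bits):
    """Lengths of the n + 1 runs of 0 bits separated by the n set bits."""
    runs, run = [], 0
    for bit in bits:
        if bit:
            runs.append(run)   # a 1 bit closes the current run of zeros
            run = 0
        else:
            run += 1
    runs.append(run)           # trailing run after the last 1 bit
    return runs


def runs_to_keywords(runs):
    """Inverse conversion: rebuild the bit vector from the run lengths."""
    bits = []
    for run in runs[:-1]:
        bits.extend([0] * run)
        bits.append(1)
    bits.extend([0] * runs[-1])
    return bits
```

For N = 7 keywords with the third, fifth and sixth present, the bit vector 0010110 gives the run lengths (2, 1, 0, 1), that is n + 1 = 4 coordinates.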
The total number of zero bits in the binary vector is N-n, so the integer vector belongs to the pyramid P_{n,N-n}. Such a vector may be converted to a numeric label using equation (5). Thus, given a database of documents each containing n of N available keywords, an index may be produced comprising a set of entries, each entry consisting of a document identifier and a numeric label representing the set of keywords present in the document corresponding to the identifier. The DBBI arrangements allow such an index to be efficiently searched to find the document identifiers of any documents containing any particular set of keywords.
Given a set of query keywords represented as a binary vector, documents containing the keywords may be found by retrieving each numeric label from each index entry and decompressing it to form a binary vector representing the keywords in the document identified by the entry. By comparing the binary vector of query keywords to the binary vector of keywords in the document, documents containing all of the query keywords may be identified. The numeric labels are decompressed to produce binary vectors by reversing the steps used to produce the labels from the binary vectors. The pyramid point calculation method described above may be used to efficiently calculate the integer vector corresponding to each numeric label.
Figs. 16A and 16B depict a general-purpose computer system 1600, upon which the various DBBI arrangements described can be practiced.
As seen in Fig. 16A, the computer system 1600 includes: a computer module 1601; input devices such as a keyboard 1602, a mouse pointer device 1603, a scanner 1626, a camera 1627, and a microphone 1680; and output devices including a printer 1615, a display device 1614 and loudspeakers 1617.
An external Modulator-Demodulator (Modem) transceiver device 1616 may be used by the computer module 1601 for communicating to and from a communications network 1620 via a connection 1621. The communications network 1620 may be a wide-area network (WAN), such as the Internet, a cellular telecommunications network, or a private WAN. Where the connection 1621 is a telephone line, the modem 1616 may be a traditional "dial-up" modem. Alternatively, where the connection 1621 is a high capacity (e.g., cable) connection, the modem 1616 may be a broadband modem. A wireless modem may also be used for wireless. connection to the communications network 1620. The computer module 1601 typically includes at least one processor unit 1605, and a memory unit 1606. For example, the memory unit 1606 may have semiconductor random access memory (RAM) and semiconductor read only memory (ROM). The computer module 1601 also includes an number of input/output (1/0) interfaces including: an audio-video interface 1607 that couples to the video display 1614, loudspeakers 1617 and microphone 1680; an I/O interface 1613 that couples to the keyboard 1602, mouse 1603, scanner 1626, 5 camera 1627 and optionally a joystick or other human interface device (not illustrated); and an interface 1608 for the external modem 1616 and printer 1615. In some implementations, the modem 1616 may be incorporated within the computer module 1601, for example within the interface 1608. The computer module 1601 also has a local network interface 1611, which permits coupling of the computer system 1600 via a connection 1623 to a local-area ) communications network 1622, known as a Local Area Network (LAN). As illustrated in Fig. 16A, the local communications network 1622 may also couple to the wide network 1620 via a connection 1624, which would typically include a so-called "firewall" device or device of similar functionality. 
The local network interface 1611 may comprise an Ethernetim circuit card, a BluetoothTM wireless arrangement or an IEEE 802.11 wireless arrangement; however, 5 numerous other types of interfaces may be practiced for the interface 1611. The 1/O interfaces 1608 and 1613 may afford either or both of serial and parallel connectivity, the former typically being implemented according to the Universal Serial Bus (USB) standards and having corresponding USB connectors (not illustrated). Storage devices 1609 are provided and typically include a hard disk drive (HDD) 1610. Other storage devices 0 such as a floppy disk drive and a magnetic tape drive (not illustrated) may also be used. An optical disk drive 1612 is typically provided to act as a non-volatile source of data. Portable memory devices, such optical disks (e.g., CD-ROM, DVD, Blu ray DiscTm), USB-RAM, P0051 10_ speci lodged / 5830629_1 141010 -32 portable, external hard drives, and floppy disks, for example, may be used as appropriate sources of data to the system 1600. The components 1605 to 1613 of the computer module 1601 typically communicate via an interconnected bus 1604 and in a manner that results in a conventional mode of operation of the computer system 1600 known to those in the relevant art. For example, the processor 1605 is coupled to the system bus 1604 using a connection 1618. Likewise, the memory 1606 and optical disk drive 1612 are coupled to the system bus 1604 by connections 1619. Examples of computers on which the described arrangements can be practised include IBM-PC's and compatibles, Sun Sparcstations, Apple MacTM or a like computer systems. The DBBI methods may be implemented using the computer system 1600 wherein the processes of Figs. 1, 13 and 15, described above, may be implemented as one or more software application programs 1633 executable within the computer system 1600. In particular, the steps of the DBBI methods are effected by instructions 1631 (see Fig. 
16B) in the software 1633 that are carried out within the computer system 1600. The software i instructions 1631 may be formed as one or more code modules, each for performing one or more particular tasks. The software may also be divided into two separate parts, in which a first part and the corresponding code modules performs the DBBI methods, and a second part and the corresponding code modules manage a user interface between the first part and the user. ) The software may be stored in a computer readable medium, including the storage devices described below, for example. The software is loaded into the computer system 1600 from the computer readable medium, and then executed by the computer system 1600. A computer readable medium having such software or computer program recorded on the computer readable medium is a computer program product. The use of the computer program 5 product in the computer system 1600 preferably effects an advantageous apparatus for performing the DBBI methods. The software 1633 is typically stored in the HDD 1610 or the memory 1606. The software is loaded into the computer system 1600 from a computer readable medium, and executed by the computer system 1600. Thus, for example, the software 1633 may be stored 0 on an optically readable disk storage medium (e.g., CD-ROM) 1625 that is read by the optical disk drive 1612. A computer readable medium having such software or computer program recorded on it is a computer program product. The use of the computer program product in the computer system 1600 preferably effects a DBBI apparatus. P0051 10_specijodged / 5830629_1 141010 -33 In some instances, the application programs 1633 may be supplied to the user encoded on one or more CD-ROMs 1625 and read via the corresponding drive 1612, or alternatively may be read by the user from the networks 1620 or 1622. Still further, the software can also be loaded into the computer system 1600 from other computer readable media. 
Computer readable storage media refers to any non-transitory tangible storage medium that provides recorded instructions and/or data to the computer system 1600 for execution and/or processing. Examples of such storage media include floppy disks, magnetic tape, CD-ROM, DVD, Blu-ray Disc, a hard disk drive, a ROM or integrated circuit, USB memory, a magneto optical disk, or a computer readable card such as a PCMCIA card and the like, whether or not such devices are internal or external of the computer module 1601. Examples of transitory or non-tangible computer readable transmission media that may also participate in the provision of software, application programs, instructions and/or data to the computer module 1601 include radio or infra-red transmission channels as well as a network connection to another computer or networked device, and the Internet or Intranets including e-mail transmissions 5 and information recorded on Websites and the like. The second part of the application programs 1633 and the corresponding code modules mentioned above may be executed to implement one or more graphical user interfaces (GUIs) to be rendered or otherwise represented upon the display 1614. Through manipulation of typically the keyboard 1602 and the mouse 1603, a user of the computer ) system 1600 and the application may manipulate the interface in a functionally adaptable manner to provide controlling commands and/or input to the applications associated with the GUI(s). Other forms of functionally adaptable user interfaces may also be implemented, such as an audio interface utilizing speech prompts output via the loudspeakers 1617 and user voice commands input via the microphone 1680. 5 Fig. 16B is a detailed schematic block diagram of the processor 1605 and a "memory" 1634. The memory 1634 represents a logical aggregation of all the memory modules (including the HDD 1609 and semiconductor memory 1606) that can be accessed by the computer module 1601 in Fig. 16A. 
When the computer module 1601 is initially powered up, a power-on self-test (POST) 0 program 1650 executes. The POST program 1650 is typically stored in a ROM 1649 of the semiconductor memory 1606 of Fig. 16A. A hardware device such as the ROM 1649 storing software is sometimes referred to as firmware. The POST program 1650 examines hardware within the computer module 1601 to ensure proper functioning and typically checks the P0051 10_specijlodged / 5830629_1 141010 -34 processor 1605, the memory 1634 (1609, 1606), and a basic input-output systems software (BIOS) module 1651, also typically stored in the ROM 1649, for correct operation. Once the POST program 1650 has run successfully, the BIOS 1651 activates the hard disk drive 1610 of Fig. 16A. Activation of the hard disk drive 1610 causes a bootstrap loader program 1652 that is resident on the hard disk drive 1610 to execute via the processor 1605. This loads an operating system 1653 into the RAM memory 1606, upon which the operating system 1653 commences operation. The operating system 1653 is a system level application, executable by the processor 1605, to fulfil various high level functions, including processor management, memory management, device management, storage management, software application ) interface, and generic user interface. The operating system 1653 manages the memory 1634 (1609, 1606) to ensure that each process or application running on the computer module 1601 has sufficient memory in which to execute without colliding with memory allocated to another process. Furthermore, the different types of memory available in the system 1600 of Fig. 16A must be used properly i so that each process can run effectively. Accordingly, the aggregated memory 1634 is not intended to illustrate how particular segments of memory are allocated (unless otherwise stated), but rather to provide a general view of the memory accessible by the computer system 1600 and how such is used. As shown in Fig. 
16B, the processor 1605 includes a number of functional modules including a control unit 1639, an arithmetic logic unit (ALU) 1640, and a local or internal memory 1648, sometimes called a cache memory. The cache memory 1648 typically includes a number of storage registers 1644 - 1646 in a register section. One or more internal busses 1641 functionally interconnect these functional modules. The processor 1605 typically also has one or more interfaces 1642 for communicating with external devices via the system bus 1604, using a connection 1618. The memory 1634 is coupled to the bus 1604 using a connection 1619.
The application program 1633 includes a sequence of instructions 1631 that may include conditional branch and loop instructions. The program 1633 may also include data 1632 which is used in execution of the program 1633. The instructions 1631 and the data 1632 are stored in memory locations 1628, 1629, 1630 and 1635, 1636, 1637, respectively. Depending upon the relative size of the instructions 1631 and the memory locations 1628 - 1630, a particular instruction may be stored in a single memory location as depicted by the instruction shown in the memory location 1630. Alternatively, an instruction may be segmented into a number of parts each of which is stored in a separate memory location, as depicted by the instruction segments shown in the memory locations 1628 and 1629.
In general, the processor 1605 is given a set of instructions which are executed therein. The processor 1605 waits for a subsequent input, to which the processor 1605 reacts by executing another set of instructions.
Each input may be provided from one or more of a number of sources, including data generated by one or more of the input devices 1602, 1603, data received from an external source across one of the networks 1620, 1622, data retrieved from one of the storage devices 1606, 1609 or data retrieved from a storage medium 1625 inserted into the corresponding reader 1612, all depicted in Fig. 16A. The execution of a set of the instructions may in some cases result in output of data. Execution may also involve storing data or variables to the memory 1634.
The disclosed DBBI arrangements use input variables 1654, which are stored in the memory 1634 in corresponding memory locations 1655, 1656, 1657. The DBBI arrangements produce output variables 1661, which are stored in the memory 1634 in corresponding memory locations 1662, 1663, 1664. Intermediate variables 1658 may be stored in memory locations 1659, 1660, 1666 and 1667.
Referring to the processor 1605 of Fig. 16B, the registers 1644, 1645, 1646, the arithmetic logic unit (ALU) 1640, and the control unit 1639 work together to perform sequences of micro-operations needed to perform "fetch, decode, and execute" cycles for every instruction in the instruction set making up the program 1633. Each fetch, decode, and execute cycle comprises:
(a) a fetch operation, which fetches or reads an instruction 1631 from a memory location 1628, 1629, 1630;
(b) a decode operation in which the control unit 1639 determines which instruction has been fetched; and
(c) an execute operation in which the control unit 1639 and/or the ALU 1640 execute the instruction.
Thereafter, a further fetch, decode, and execute cycle for the next instruction may be executed. Similarly, a store cycle may be performed by which the control unit 1639 stores or writes a value to a memory location 1632.
Each step or sub-process in the processes of Figs.
1, 13 and 15 is associated with one or more segments of the program 1633 and is performed by the register section 1644, 1645, 1647, the ALU 1640, and the control unit 1639 in the processor 1605 working together to P0051 10_specijlodged / 5830629_1 141010 -36 perform the fetch, decode, and execute cycles for every instruction in the instruction set for the noted segments of the program 1633. The DBBI method may alternatively be implemented in dedicated hardware such as one or more integrated circuits performing the DBBI functions or sub functions. Such dedicated hardware may include graphic processors, digital signal processors, or one or more microprocessors and associated memories. Industrial Applicability The arrangements described are applicable to the computer and data processing industries and particularly those areas dealing with image retrieval. ) The foregoing describes only some embodiments of the present invention, and modifications and/or changes can be made thereto without departing from the scope and spirit of the invention, the embodiments being illustrative and not restrictive. In the context of this specification, the word "comprising" means "including principally but not necessarily solely" or "having" or "including", and not "consisting only of". Variations of the word "comprising", such as "comprise" and "comprises" have correspondingly varied meanings. P0051 10_specilodged / 5830629_1 141010 -37 APPENDIX AALGORITHMS Algorithm 1 projectBackIn Given a vector, v consisting of N coordinates with i-th coordinate denoted by vi, the following procedure modifies the coordinates so that v is adjusted to the closest point in the simplex, i.e. the closest point with non-negative coordinates, such that the coordinates sum to the same value as the coordinates of v. 1. Letw=;c=0 2. Fori=1 toN 2. 1. Ifv < 0 , then 2.1.1. Setw=w-vi 2.1.2. Set Vi 0 2.2. else 2.2.1. Let c c + 1 3. While W > 0 5 3.1. Let u =w/ c 3.2. For i= I to N 3.2.1. Ifvi > 0 , then 3.2.1.1. 
Set vi = vi - I 3.2.2. If Vi < 0 , then ) 3.2.2.1. Set w" = w? - v 3.2.2.2. Set vi = 0 3.2.3. Else if Vi> 0 , then 3.2.3.1. Let C = C + 1 Points that lie outside the simplex are those that have at least one negative coordinate. 5 Algorithm 2 compress Given an N dimensional vector, v , determine a label, I , to represent v 1. Lety = v N 2. Let v' = projectBackln(v') using Algorithm 1. 3. Let p = quantize(v') using Algorithm 4. 0 4. Let I = label(p)using Algorithm 5. P0051 10_speci lodged / 5830629_1 141010 -38 N is 128 when v is a SIFT vectors but may be different for other types of image feature vectors. Algorithm 3 decompress Given a label, 1 , determine the represented N dimensional vector, v i 1. Let v" = delabel(l) using Algorithm 6. 2. Let v =v' 3. Let v =Iv' 1 N is 128 when v is a SIFT vectors but may be different for other types of image feature vectors. ) Algorithm 4 quantize Given a 128 dimensional vector, v , whose coordinates sum to h, determine the nearest point, P , in P127.. 1. Let in h 5 2. For = I to 128 2.1. Let Pi =Li 1 2.2. Let fi vi - Pi 2.3. Let m = m - Pi 3. If m > O , then 3 3.1. Let c be the set ofm coordinates of f with the largest values. 3.2. For each i in c 3.2. 1. Setp; = Pi +I1 In this example, the vector dimension Nis set to 128 as for when v is a SIFT vectors but may be different for other types of image feature vectors. !5 Algorithm 5 label Given a vector, P in Pn,h, determine its label, . 1. Let s 1 = 0 2. For i = 1to n 30 2.1. Lets= s+Pi 2.2. LetJ = I + i+s-IC P005110 speci lodged/ 5830629_1 141010 -39 Algorithm 6 delabel The following pyramid point coordinate method calculates the coordinates of a point q in the pyramidPn.& given its label 1. The label I is assumed to have been determined using pyramid point enumeration. I1. LetL=l;H=h 2. For i =n down to 0 2.1. Find the largest integerj such that |PijI e L 2.2. Set q,+ 1 = H - (j+1) 2.3. Set H =j+1 ) 2.4. 
Set L L -|IPtjI Note that in step 2.1, when i = 0,j is set to -1 according to the convention that Po.-I is the empty set. Algorithm 6A delabel (alternative description) 5 Given a label, I , determine its point, P , in Pn.ha. 1. Let H = h 2. Fori ntoO 2.1. Find s such that "sci s 1 < i"+Ii 2.2. Let
P
i.+ = H - (s +1) ) 2.3. Let H = s+1 2.4. Let I = I - +"C Note that in step 2.1, when i = 0, j is set to -1 according to the convention that "Cb = 0 whenever b > a . 5 Algorithm 7 unary label Given a vector, P in P127,, determine its label, . P0051 10_speci lodged / 5830629_1 141010 -40 1. Letm=h+1 2 6 2. Let t be a buffer of length m bits, with all bits set to zero 3. LetS 0 4. For =I to 127 4.1. Set S = S + Pi+ 1 4.2.If s5 sm then set is Algorithm 8 unary delabel Given a label, , determine its point, P , in Piz7.h. 1. Letm=h+ 1 2 6 ) 2. Let i = 1 3. Set P 1 2 8 = h 4. For j = I to 127 4.1. Set Pj = 0 4.2. While l = 0 and i5 sm 5 4.2.1. Seti=i+i 4.2.2. Set Pi = PJ + 4.3. Set P128 - P128 - PI 4.4. Set I + 1 P0051 10_speci lodged / 5830629_1 141010
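The labelling and delabelling steps of Algorithms 5 and 6 can be sketched in Python as follows. This is a sketch under stated assumptions, not a definitive implementation: it reads step 2.2 of Algorithm 5 as adding the binomial coefficient C(i + s - 1, i), treats a point of the pyramid P_{n,h} as n + 1 non-negative integers summing to h, and uses the simplex height only as a double precision starting estimate that is then corrected with exact integer comparisons. The function names are hypothetical.

```python
from math import comb, exp, floor, lgamma, log


def label(point):
    """Label of a point of P(n, h): n + 1 non-negative integers summing to h."""
    total, partial = 0, 0
    for i, coord in enumerate(point[:-1], start=1):
        partial += coord                    # s_i = p_1 + ... + p_i
        total += comb(i + partial - 1, i)   # assumed reading of step 2.2
    return total


def largest_param(ident, i):
    """Largest c >= i - 1 such that comb(c, i) <= ident."""
    if ident == 0:
        return i - 1
    # Height t of an i-dimensional simplex with volume t**i / i! == ident.
    height = exp((log(ident) + lgamma(i + 1)) / i)
    c = max(i - 1, floor(height) - 1)       # floating-point estimate only
    while comb(c, i) > ident:               # exact correction downwards
        c -= 1
    while comb(c + 1, i) <= ident:          # exact sequential search upwards
        c += 1
    return c


def delabel(ident, n, h):
    """Recover the point of P(n, h) that `label` maps to `ident`."""
    point = [0] * (n + 1)
    upper = h                               # partial sum s_(i+1); s_(n+1) = h
    for i in range(n, 0, -1):
        c = largest_param(ident, i)
        partial = c - i + 1                 # s_i
        point[i] = upper - partial          # p_(i+1) = s_(i+1) - s_i
        upper = partial
        ident -= comb(c, i)
    point[0] = upper                        # p_1 = s_1
    return tuple(point)
```

Because the estimate is corrected with exact integer arithmetic, rounding error in the logarithm affects only the number of search steps and not the result, although the estimate itself is limited to values representable in double precision.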

Claims (18)

1. A method of determining a coordinate of a normalized feature vector representing a portion of an image, where the feature vector is represented by a numerical identifier, the method comprising the steps of:
(a) finding a largest binomial coefficient whose value is less than or equal to the numerical identifier, the largest binomial coefficient being defined by a value of a parameter where the parameter is determined by:
(i) determining a bound for the parameter, said bound being based on a height of a simplex having a volume equal to the numerical identifier;
(ii) calculating a binomial coefficient from the determined bound;
(iii) finding the value of the parameter by comparing the determined binomial coefficient against the numerical identifier; and
(b) determining the coordinate of the feature vector based on the parameter value defining said largest binomial coefficient, said feature vector representing the portion of the image.
2. A method of determining a coordinate of a quantised feature vector representing a portion of an image, where the quantised feature vector is represented by a numerical identifier, the method comprising the steps of:
(a) determining the height of a simplex having a volume equal to the numerical identifier;
(b) determining, dependent upon the height of said simplex, a bound for a parameter defining a corresponding binomial coefficient;
(c) determining, constrained by the determined bound, the largest value of the parameter such that the value of the corresponding binomial coefficient is the largest value less than or equal to the numerical identifier; and
(d) determining the coordinate of the quantised feature vector based on the largest value of the parameter, said quantised feature vector representing the portion of the image.
3. The method according to claim 2, comprising the further step of: unmapping the quantised feature vector to determine a normalized feature vector corresponding to a key point in said portion of the image.
4. The method according to claim 2, wherein the numerical identifier is constructed according to the steps of: mapping, using a gnomonic projection, a feature vector of dimension N to a mapped feature vector in a sub-space of dimension less than N; quantising the mapped feature vector to one of a finite set of quantised points in said sub-space; and assigning an identifier to said one of the finite set of quantised points, said identifier being said numerical identifier.
5. The method according to claim 2 comprising the further steps of: (c) updating said numerical identifier based on said parameter value; and (d) repeating the steps (a), (b), (c) and (d) using the updated numerical identifier as the numerical identifier, to determine further coordinate values for the quantised feature vector.
6. The method according to claim 2 comprising the further steps of: (c) updating said numerical identifier based on said parameter value; and (d) repeating steps (a), (b), (c) and (d) using the updated numerical identifier as the numerical identifier, to determine further coordinate values for the quantised feature vector.
7. The method of claim 2 wherein the step of determining the bound comprises solving a polynomial equation with coefficients determined from the height of said simplex.
8. The method of claim 2 wherein the step of determining the largest value of the parameter comprises conducting a sequential search starting at said bound.
9. The method of claim 2 wherein the step of determining the largest value of the parameter comprises conducting a binary search with range constrained by said bound.
10. The method of claim 9 wherein said binary search is conducted with a range constrained by upper and lower bounds, where both the upper and lower bounds are determined from the height of the simplex.
11. The method of claim 2 wherein the height of the simplex is determined from a logarithm of the numerical identifier using double precision floating point arithmetic.
12. A method of determining a coordinate of a feature vector representing a portion of an image where the feature vector is represented by a numerical identifier, the method comprising the steps of: (a) determining a bound for a parameter defining a largest binomial coefficient whose value is less than or equal to the numerical identifier, said bound being determined from the height of a simplex whose volume is equal to the value of the numerical identifier; (b) determining said parameter from said bound; and (c) determining a coordinate of the feature vector based on the parameter.
13. An apparatus for determining a coordinate of a quantised feature vector representing a portion of an image, where the quantised feature vector is represented by a numerical identifier, the apparatus comprising: (a) means for determining the height of a simplex having a volume equal to the numerical identifier; (b) means for determining, dependent upon the height of said simplex, a bound for a parameter defining a corresponding binomial coefficient; (c) means for determining, constrained by the determined bound, the largest value of the parameter such that the value of the corresponding binomial coefficient is the largest value less than or equal to the numerical identifier; and (d) means for determining the coordinate of the quantised feature vector based on the inverse of said largest binomial coefficient, said quantised feature vector representing the portion of the image.
14. A computer readable storage medium having a computer program recorded therein, the program being executable by a computer apparatus to make the computer perform a method for determining a coordinate of a quantised feature vector representing a portion of an image, where the quantised feature vector is represented by a numerical identifier, the program comprising: (a) computer software code for determining the height of a simplex having a volume equal to the numerical identifier; (b) computer software code for determining, dependent upon the height of said simplex, a bound for a parameter defining a corresponding binomial coefficient; (c) computer software code for determining, constrained by the determined bound, the largest value of the parameter such that the value of the corresponding binomial coefficient is the largest value less than or equal to the numerical identifier; and (d) computer software code for determining the coordinate of the quantised feature vector based on the inverse of said largest binomial coefficient, said quantised feature vector representing the portion of the image.
15. An apparatus for determining a coordinate of a quantised feature vector representing a portion of an image, where the quantised feature vector is represented by a numerical identifier, the apparatus comprising: a processor; and a memory in which is stored a program configured to direct the processor to perform a method comprising the steps of: (a) determining the height of a simplex having a volume equal to the numerical identifier; (b) determining, dependent upon the height of said simplex, a bound for a parameter defining a corresponding binomial coefficient; (c) determining, constrained by the determined bound, the largest value of the parameter such that the value of the corresponding binomial coefficient is the largest value less than or equal to the numerical identifier; and (d) determining the coordinate of the quantised feature vector based on the inverse of said largest binomial coefficient, said quantised feature vector representing the portion of the image.
16. A method of determining a coordinate of a quantised feature vector, substantially as described herein with reference to the accompanying drawings.
17. An apparatus for determining a coordinate of a quantised feature vector, substantially as described herein with reference to the accompanying drawings.
18. A computer readable storage medium having a computer program recorded therein, substantially as described herein with reference to the accompanying drawings.
DATED this 14th Day of December 2011 CANON KABUSHIKI KAISHA Patent Attorneys for the Applicant SPRUSON & FERGUSON
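The decoding steps recited in the claims above (a bound from the simplex height, a bounded search for the largest binomial coefficient, then updating the identifier and repeating) can be illustrated with a minimal Python sketch. It assumes the numerical identifier is a rank in the standard combinatorial number system, which the claims do not state explicitly. The function names `largest_n`, `decode` and `encode` are hypothetical, the bound used is the elementary inequality noted in the comments rather than the polynomial equation of claim 7, and the further mapping from the recovered parameters to coordinates of the quantised feature vector is not reproduced here.

```python
import math


def largest_n(m: int, d: int) -> int:
    """Return the largest n such that C(n, d) <= m, for m >= 0 and d >= 1."""
    if m == 0:
        return d - 1  # C(d - 1, d) == 0 is the largest coefficient <= 0
    # Height h of a right-angled d-simplex whose volume h**d / d! equals m,
    # computed from the logarithm of m in double precision (cf. claim 11).
    h = math.exp((math.log(m) + math.lgamma(d + 1)) / d)
    # For n >= d - 1: (n - d + 1)**d / d! <= C(n, d) <= n**d / d!,
    # so the answer lies in [floor(h), h + d - 1].
    # Assumption: one extra step of slack on each side is enough to absorb
    # floating-point rounding, which holds only while h is far below 2**53.
    lo = max(d - 1, math.floor(h) - 1)
    hi = math.floor(h) + d
    while lo < hi:  # binary search inside the bounded range (cf. claims 9, 10)
        mid = (lo + hi + 1) // 2
        if math.comb(mid, d) <= m:
            lo = mid
        else:
            hi = mid - 1
    return lo


def decode(identifier: int, dims: int) -> list:
    """Recover the parameters n_dims > ... > n_1 >= 0 encoded by the identifier."""
    params = []
    m = identifier
    for k in range(dims, 0, -1):
        n = largest_n(m, k)
        params.append(n)
        m -= math.comb(n, k)  # update the identifier, then repeat
    return params


def encode(params) -> int:
    """Inverse of decode: sum of C(n_k, k) over the parameters."""
    k = len(params)
    return sum(math.comb(n, k - i) for i, n in enumerate(params))
```

For example, `decode(10, 3)` yields `[5, 1, 0]`, because C(5, 3) = 10 is the largest three-element binomial coefficient not exceeding 10 and the remaining identifier is 0. The exact integer test `math.comb(mid, d) <= m` makes the final answer independent of rounding in `h`; the floating-point height only narrows the search range.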
AU2011254041A 2011-12-14 2011-12-14 Compression of sift vectors for image matching Abandoned AU2011254041A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2011254041A AU2011254041A1 (en) 2011-12-14 2011-12-14 Compression of sift vectors for image matching

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
AU2011254041A AU2011254041A1 (en) 2011-12-14 2011-12-14 Compression of sift vectors for image matching

Publications (1)

Publication Number Publication Date
AU2011254041A1 (en) 2013-07-04

Family

ID=48700105

Family Applications (1)

Application Number Title Priority Date Filing Date
AU2011254041A Abandoned AU2011254041A1 (en) 2011-12-14 2011-12-14 Compression of sift vectors for image matching

Country Status (1)

Country Link
AU (1) AU2011254041A1 (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109829851A (en) * 2019-01-17 2019-05-31 厦门大学 A kind of Panorama Mosaic method and storage equipment based on spherical surface alignment estimation
CN110688502A (en) * 2019-09-09 2020-01-14 重庆邮电大学 Image retrieval method and storage medium based on depth hash and quantization
CN110688502B (en) * 2019-09-09 2022-12-27 重庆邮电大学 Image retrieval method and storage medium based on depth hash and quantization
CN112668632A (en) * 2020-12-25 2021-04-16 浙江大华技术股份有限公司 Data processing method and device, computer equipment and storage medium
CN112668632B (en) * 2020-12-25 2022-04-08 浙江大华技术股份有限公司 Data processing method and device, computer equipment and storage medium

Similar Documents

Publication Publication Date Title
Duan et al. Overview of the MPEG-CDVS standard
Kekre et al. Image retrieval using augmented block truncation coding techniques
He et al. Mobile product search with bag of hash bits and boundary reranking
US8913853B2 (en) Image retrieval system and method
US6961736B1 (en) Compact color feature vector representation
TWI506459B (en) Content-based image search
US10062083B2 (en) Method and system for clustering and classifying online visual information
JP5911578B2 (en) Method for encoding feature point position information of image, computer program, and mobile device
US9349072B2 (en) Local feature based image compression
US20040220898A1 (en) Information processing apparatus, method, storage medium and program
JP2005235175A (en) Image feature set based on exif for content engine
US9600738B2 (en) Discriminative embedding of local color names for object retrieval and classification
CN107735783B (en) Method and apparatus for searching image
JP2001511930A (en) Image search system
US8666992B2 (en) Privacy preserving method for querying a remote public service
Vázquez et al. Using normalized compression distance for image similarity measurement: an experimental study
Meharban et al. A review on image retrieval techniques
AU2011254041A1 (en) Compression of sift vectors for image matching
Chen et al. Image retrieval based on quadtree classified vector quantization
Singh et al. Ensemble visual content based search and retrieval for natural scene images
JP2001319232A (en) Device and method for retrieving similar image
Suryawanshi Image Recognition: Detection of nearly duplicate images
Khwildi et al. A new indexing method of HDR images using color histograms
Yeh et al. Content-based image retrieval through compressed indices based on vector quantized images
Wu et al. Image indexing in DCT domain

Legal Events

Date Code Title Description
MK4 Application lapsed section 142(2)(d) - no continuation fee paid for the application