Wednesday, March 14, 2012

Aggregating local descriptors into a compact image representation

Aggregating local descriptors into a compact image representation,
Herve Jegou, et al.,
Proc. IEEE CVPR'10
================================================
The goal of this paper is to solve the problem of image search on a very large scale, where three constraints have to be considered jointly: the accuracy of the search, its efficiency, and the memory
usage of the representation.

There are three main steps:
1. the representation, i.e., how to aggregate local image descriptors into a vector representation;
2. the dimensionality reduction of these vectors;
3. the indexing algorithm.

1. the representation

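The paper proposes a descriptor, derived from both BOF and the Fisher kernel, that aggregates SIFT descriptors into a compact representation called VLAD (vector of locally aggregated descriptors). Each local descriptor is assigned to its nearest codebook centroid, and the residuals (descriptor minus centroid) are accumulated per centroid and concatenated into a single vector v.
The vector v is subsequently L2-normalized: v := v/||v||_2.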
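
To make the aggregation concrete, here is a minimal NumPy sketch of VLAD as I understand it from the paper. The function name `vlad`, the random data, and the tiny codebook are my own illustration, not the authors' code. In practice the codebook would be learned with k-means on real SIFT descriptors.

```python
import numpy as np

def vlad(descriptors, centroids):
    """Aggregate local descriptors (n, d) against a codebook (k, d) into a k*d vector."""
    k, d = centroids.shape
    # Assign each descriptor to its nearest centroid (squared Euclidean distance).
    dists = ((descriptors[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    nearest = dists.argmin(axis=1)
    # Accumulate the residuals (descriptor minus centroid) for each centroid.
    v = np.zeros((k, d))
    for i in range(k):
        assigned = descriptors[nearest == i]
        if len(assigned):
            v[i] = (assigned - centroids[i]).sum(axis=0)
    v = v.ravel()
    # L2-normalize: v := v / ||v||_2 (left unchanged if all residuals are zero).
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

# Hypothetical toy data: 50 descriptors of dimension 8, codebook of 4 centroids.
rng = np.random.default_rng(0)
centroids = rng.normal(size=(4, 8))
descriptors = rng.normal(size=(50, 8))
v = vlad(descriptors, centroids)
```

The output dimension is k*d, so a small codebook already gives a fairly long vector, which is why the reduction and encoding steps matter.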


Then the image vector has to be encoded into a compact code. The encoding is chosen so that the nearest neighbors of a (non-encoded) query vector can be searched efficiently in a set of n encoded database vectors.

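The paper uses the asymmetric distance computation (ADC) approach for approximate nearest neighbor search: the database vectors are quantized, while the query vector is not.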
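
Here is a rough sketch of ADC with a product quantizer, written to help me understand the idea. All names (`pq_encode`, `adc_distances`) and sizes are my own assumptions. The codebooks are simply sampled from training vectors as a stand-in for k-means, which is a simplification and not what the paper does.

```python
import numpy as np

def pq_encode(x, codebooks):
    """Encode rows of x (n, D) as m centroid indices, one per subvector."""
    m, ks, ds = codebooks.shape
    sub = x.reshape(len(x), m, ds)
    # Squared distance from each subvector to each centroid of its subquantizer: (n, m, ks).
    d = ((sub[:, :, None, :] - codebooks[None, :, :, :]) ** 2).sum(axis=3)
    return d.argmin(axis=2)

def adc_distances(query, codes, codebooks):
    """Approximate squared distances from a raw query to the encoded database vectors."""
    m, ks, ds = codebooks.shape
    q = query.reshape(m, ds)
    # Lookup table of squared distances between query subvectors and centroids: (m, ks).
    table = ((q[:, None, :] - codebooks) ** 2).sum(axis=2)
    # Sum the table entries selected by each database code.
    return table[np.arange(m), codes].sum(axis=1)

# Hypothetical sizes: D=16, m=4 subquantizers, ks=8 centroids each.
rng = np.random.default_rng(0)
D, m, ks = 16, 4, 8
ds = D // m
train = rng.normal(size=(200, D))
# Stand-in for k-means: pick ks training subvectors per subspace as centroids.
codebooks = np.stack([
    train[rng.choice(len(train), ks, replace=False), j * ds:(j + 1) * ds]
    for j in range(m)
])
database = rng.normal(size=(100, D))
codes = pq_encode(database, codebooks)
query = rng.normal(size=D)
approx = adc_distances(query, codes, codebooks)
```

The point of the asymmetry is that the query keeps full precision, so the only approximation error comes from quantizing the database side.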



2. the dimensionality reduction

Dimensionality reduction is an important step in approximate nearest neighbor search, as it impacts the subsequent indexing method. The paper uses the standard tool for dimensionality reduction, principal component analysis (PCA).
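
A minimal PCA sketch via SVD follows, assuming the vectors are stored as rows of a NumPy array. The function names `fit_pca` and `apply_pca` and the toy sizes are mine, chosen only for illustration.

```python
import numpy as np

def fit_pca(X, d_out):
    """Return the data mean and the top d_out principal directions of X (n, D)."""
    mean = X.mean(axis=0)
    # SVD of the centered data; rows of vt are principal directions,
    # ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:d_out]

def apply_pca(X, mean, components):
    """Project rows of X onto the retained principal directions."""
    return (X - mean) @ components.T

# Hypothetical toy data standing in for VLAD vectors: 200 vectors, D=32 reduced to D'=8.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 32))
mean, components = fit_pca(X, 8)
X_reduced = apply_pca(X, mean, components)
```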



3. the indexing algorithm.

Finally, the paper focuses on the joint optimization of reduction and indexing: it optimizes the reduced dimension D′ under a fixed constraint on the number of bits B used to represent the D-dimensional VLAD vector x, for instance B=128 (16 bytes).
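
To see how a bit budget like B=128 can arise, here is a small arithmetic sketch. It assumes a product quantizer with m subquantizers of ks centroids each, where ks is a power of two. The helper `code_bits` and the specific values of m and ks are my own assumptions for illustration.

```python
import math

def code_bits(m, ks):
    """Bits needed to store one code: m indices, each addressing ks centroids."""
    return m * int(math.log2(ks))

# Assumed setting: 16 subquantizers with 256 centroids each (8 bits per index).
B = code_bits(16, 256)
n_bytes = B // 8
```

With these assumed values, B is 128 bits, i.e. 16 bytes per image. The reduced dimension D′ then only changes how many components each subquantizer has to cover, not the code size.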


 =============================================
Comments:

Pros:
1. The paper is clear and cites many related works for reference and comparison.
2. It combines and modifies existing methods and achieves good performance.

Cons:
1. It mentions the Fisher kernel, but I could not find a clear explanation of how it relates to the proposed method.
