Please use this identifier to cite or link to this item:
http://repo.lib.jfn.ac.lk/ujrr/handle/123456789/815
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Kirishanthy, T. | - |
dc.contributor.author | Ramanan, A. | - |
dc.date.accessioned | 2016-01-04T07:18:19Z | - |
dc.date.accessioned | 2022-06-28T04:51:45Z | - |
dc.date.available | 2016-01-04T07:18:19Z | - |
dc.date.available | 2022-06-28T04:51:45Z | - |
dc.date.issued | 2015-11-24 | - |
dc.identifier.uri | http://repo.lib.jfn.ac.lk/ujrr/handle/123456789/815 | - |
dc.description.abstract | In a patch-based object recognition system the key role of a visual vocabulary is to provide a way to map the low-level features into a fixed-length vector in histogram space to which standard classifiers can be directly applied. The discriminative power of such a visual vocabulary determines the quality of the vocabulary model, whereas the size of the vocabulary controls the complexity of the model. A compact visual vocabulary provides a lower-dimensional representation whereas a large-sized vocabulary may overfit to the distribution of visual words in an image and lead to heavy computational load. The generic framework of a bag-of-features approach follows a standard routine extracting local image descriptors and clustering with a user-designated number of clusters. The problem with this routine lies in that constructing a vocabulary for each single dataset is not efficient. Usually the construction of a vocabulary is achieved by cluster analysis using K-means algorithm. However, one of its drawbacks is the choice of a suitable value for K which determines the size of a visual vocabulary. The choice of the size of a vocabulary should be balanced between the recognition rate and computational needs. In this paper we propose a two-staged approach to map an initial high-dimensional vocabulary into a compact vocabulary while maintaining its discriminative power. Using an initial larger vocabulary we first represent the training images using a coding scheme that maps the importance of each visual word within an image as visual bits. These set of visual bits of images then form a sparse representation of every visual word with respect to the set of category-specific training images that is used for the compression. We have tested our vocabulary compression technique on four computer vision tasks: (i) Xerox7 (ii) PASCAL VOC Challenge 2007 (iii) UIUC texture and (iv) MPEG7 CE Shape-1 Part B Silhouette image datasets. Testing results show that the proposed method slightly outperforms vocabularies learnt by K-means by achieving just half the size of initial vocabulary. Our compression technique could help to optimize larger vocabularies to fewer visual words with stable performance. | en_US |
dc.language.iso | en | en_US |
dc.publisher | IEEE | en_US |
dc.subject | Bag-of-features; Compact Visual Vocabulary; Object recognition; Visual bit representation | en_US |
dc.title | Creating Compact and Discriminative Visual Vocabularies using Visual Bits | en_US |
dc.type | Article | en_US |
Appears in Collections: | Computer Science |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
Visual Vocabularies Using Visual Bits.pdf | 177.14 kB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.