Learning a large number of visual object categories for content-based retrieval in image and video databases
ARRS project (J2-3607)
Project duration: 2010 - 2013
What do we want to do?
We are now witnessing a significant growth in digital image and video databases. To allow a human user to efficiently access the desired content, the images and videos need to be semantically labeled. The classical low-level visual features through which computers perceive images are not directly linked to high-level human visual interpretations, which creates a semantic gap. Our challenge is to develop a methodology that bridges the gap between computer-centered low-level image features and high-level human-centered semantic meanings.
What is the starting point and where are we headed?
Within the EU project POETICON, our group has developed a hierarchical object class model for visual retrieval that is based on the intuitive principle of compositionality. We will now focus on modeling and learning a larger number of visual object categories within a hierarchical compositional framework. Our approach will enable continuous learning of novel object categories through user interaction, as well as autonomous indexing of object categories in image databases. We expect to open new perspectives on computer-user interaction in terms of continuous user-in-the-loop semantic queries and queries at different levels of detail that retain their semantic meaning.
What will be the use of results?
The project is the first holistic proposal for using hierarchical categorical representations for learning, indexing and querying in visual databases. For this reason, it has high relevance and scientific value for the areas of computer vision and artificial cognitive vision. We foresee immediate application of the project’s results in the media and telecommunications industries, as well as in the emerging area of cognitive robotics.