10 November 2000, Computational Logic Seminar

Speaker: Arjen de Vries, CWI
Title: Challenging Ubiquitous Inverted Files
Date and Time: November 10, 2000, 13.30-14.30
Location: Euclides Building, Plantage Muidergracht 24, Amsterdam, room P.327

Searching collections by their content (be it text, images, or true multimedia content) has become a common requirement in software systems, as a result of the successes achieved in the information retrieval (IR) research field, the demand for better search engines on the WWW, and the continuously growing amount of digitized photo footage presented online.

Retrieval systems enabling content-based search are typically standalone systems that have been developed for very specific applications. There is not much consensus on how these techniques should be integrated into general-purpose database management systems (DBMSs). State-of-the-art solutions simply make new functions available in the query language. These functions interface to the otherwise still standalone software systems. This leaves the user with the burden of both formulating the query and combining the results for each individual representation into a final judgement. It also usually leads to inefficient processing of queries that involve several content representations.

This position paper discusses the architecture of the Mirror DBMS, designed specifically to enable the integration of databases and (multimedia) information retrieval. We focus on a specific sub-problem of this challenging problem: the efficient processing of queries integrating text retrieval with structured document queries. An iconic example of an information need requiring such query processing is `English newspaper articles about Willem-Alexander dating Maxima'. `English' and `newspaper article' refer to attributes of a structured document collection, whereas the aboutness-clause is presumably best processed by information retrieval techniques.
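
To make the example information need concrete, the following is a minimal sketch of a query that combines an exact-match selection on structured attributes with a ranked text match. It is not the Mirror DBMS approach; the toy collection, the attribute names (`language`, `type`), the `search` function, and the simple tf-idf weighting are all assumptions chosen for illustration.

```python
import math
from collections import Counter

# Hypothetical toy collection; attribute names and texts are illustrative only.
DOCS = [
    {"id": 1, "language": "English", "type": "newspaper article",
     "text": "Willem-Alexander seen dating Maxima in New York"},
    {"id": 2, "language": "Dutch", "type": "newspaper article",
     "text": "Willem-Alexander en Maxima samen gezien"},
    {"id": 3, "language": "English", "type": "newspaper article",
     "text": "Parliament debates the new budget"},
    {"id": 4, "language": "English", "type": "press release",
     "text": "Maxima and Willem-Alexander attend a gala"},
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def search(docs, language, doc_type, query):
    """Return ids of matching documents, best text match first."""
    # Structured part: exact-match selection on document attributes.
    candidates = [d for d in docs
                  if d["language"] == language and d["type"] == doc_type]

    # Document frequencies are computed over the whole collection.
    n = len(docs)
    df = Counter()
    for d in docs:
        df.update(set(tokenize(d["text"])))

    # Content part: score each candidate with a simple tf-idf sum
    # (an assumed weighting, used here only as a stand-in for IR ranking).
    scored = []
    for d in candidates:
        tf = Counter(tokenize(d["text"]))
        score = sum(tf[t] * math.log(1 + n / df[t])
                    for t in tokenize(query) if t in tf)
        if score > 0:
            scored.append((score, d["id"]))

    # Highest score first; ties broken by document id.
    return [doc_id for _, doc_id in sorted(scored, key=lambda s: (-s[0], s[1]))]


result = search(DOCS, "English", "newspaper article",
                "Willem-Alexander dating Maxima")
```

In this sketch the attribute selection runs before the ranking, so only candidate documents are scored. Which of the two parts to evaluate first, and how to do so efficiently, is the kind of query processing question the abstract refers to.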
