Spatial Reasoning: Theory and Practice
Marco Aiello
Abstract:
Spatial structures and spatial reasoning are essential to perception
and cognition. Much day-to-day practical information is about what
happens at certain spatial locations. Moreover, spatial
representation is a powerful source of geometric intuitions that
underly general cognitive tasks. How can we represent spatially
located entities and reason about them? To take a concrete domestic
example: when we are setting a table and place a spoon, what are the
basic spatial properties of this new item in relation to others, and
to the rest of the space? Not only, there are further basic aspects to
perception: we have the ability to compare different visual scenes,
and recognize objects across them, given enough `similarity'. More
concretely: which table settings are `the same'? This is another task
for which logic provides tools.
Constraining space within the bounds of a logical theory and using
related formal reasoning tools must be performed with particular care.
One cannot expect the move from space to formal theories of space to
be complete. Natural spatial phenomena will be left out of logical
theories of space, while non-natural spatial phenomena could try to
sneak in. Space in nature is one thing; space confined and restricted in the
bounds of a formal representation and reasoning system is quite
another thing. Connectivity, parthood, and coherence, should be
correctly handled and expressed by the formalism, not aiming at a
complete representation of space, but focusing on expressing the
most perspicuous spatial phenomena.
Our contribution with this thesis is twofold. On the one hand, we
investigate new and existing spatial formalisms with the explicit goal
of identifying languages nicely balancing expressive power and
tractability. On the other hand, we study the feasibility of
practical applications of such qualitative languages of space, by
investigating two symbolic approaches to pattern recognition.
The thesis is organized in seven technical chapters, plus an
introductory and a conclusions chapters, and three appendices. The
chapters from 2 to5 form the theoretical core of the dissertation, while
Chapters 6 and 7 are the practical component.
The first two chapters set the boundaries of our framework:
Chapter 2 from the expressive point of view, and Chapter 3 from the
axiomatization one. Then, we analyze two sorts of extensions of the
framework. Logical extensions are presented in Chapter 4, while geometrical
ones are introduced in Chapter 5.
In Chapter 2, we revive the topological interpretation
of modal logics, turning it into a general language of patterns in
space. In particular, we define a notion of bisimulation for
topological models that compares different visual scenes. We refine
the comparison by introducing Erenfeuch-Fraisse style games played on patterns
in space.
In Chapter 3, we investigate the topological
interpretation of modal logic in modern terms, using the notion of
bisimulation introduced in Chapter 2. We look at modal
logics with interesting topological content, presenting, amongst
others, a new proof of McKinsey and Tarski's theorem on completeness
of S4 with respect to the real line, and a completeness proof for the
logic of finite unions of convex sets of the reals.
In Chapter 4 we consider logical extensions to the
topological modal approach to space. The introduction of universal and
hybrid modalities is investigated with respect to the added logical
expressive power. A spatial version of the tense Since and Until logic
is also examined. A brief comparison with higher-order formalisms
gives a more general perspective on (extended) modal logics of space.
In Chapter 5, we proceed in the modal investigation
of space by moving to affine and metric
geometry, and vector algebra. This allows us to see new fine-structure
in spatial patterns suggesting analogies across these mathematical
theories in terms of modal, modal logic and conditional logics.
Expressive power is analyzed in terms of language design,
bisimulations, and correspondence phenomena. The result is both
unification across the areas visited, and the uncovering of
interesting new questions.
In Chapter 6, we take a different look at model
comparison games for the purpose of designing an image similarity
measure for image retrieval. Model comparison games can be used not
only to decide whether two specific models are equivalent or not, but
also to establish a measurement of difference among a whole class of
models. We show how this is possible in the case of the spatial modal
logic S4u. The approach results in a spatial similarity measure
based on topological model comparison games. We move towards practice
by giving an algorithm to effectively compute the similarity measure
for a class of topological models widely used in computer science
applications: polygons of the real plane. At the end of the chapter,
we briefly overview an implemented system based on the game-similarity
measure.
In Chapter 7, we use a propositional language of
qualitative rectangle relations to detect the reading order from
document images. To this end, we define the notion of a document
encoding rule and we analyze possible formalisms to express document
encoding rules such as LaTeX, SGML languages, and others. Document
encoding rules expressed in the propositional language of rectangles
are used to build a reading order detector for document images. In
order to achieve robustness and avoid brittleness when applying the
system to real life document images, the notion of a thick boundary
interpretation for a qualitative relation is introduced. The system is
tested on a collection of heterogeneous document images showing recall
rates up to 89%.
The presentation ends with three appendices. Appendix A is a brief recall
of basic topological notions, useful for reading Chapters 2, 3, and 4.
Appendix B presents an algorithm for sorting directed transitive cyclic
graphs in relation to the system presented in Chapter 7. Appendix C overviews
three implementations related to the thesis.