Logic Engineering. The Case of Description and Hybrid Logics
Carlos Areces

Abstract:
As the title indicates, there are two levels involved in the research
carried out in this thesis: the general issue of understanding (and
promoting) Logic Engineering, together with a detailed study of its
particular instantiation for Description and Hybrid Languages.

For some years now, a trend has been developing in the field of
computational logic: given the wide diversity of applications the
field has advanced into (theorem proving, software and hardware
verification, computational linguistics, knowledge representation,
etc.), a multiplicity of formal languages has been developed, offering
a wealth of alternatives to classical languages.  With the advantages
of this diversity of choice comes added complexity.  How do we decide
what the best formalism is for a given reasoning or modeling task?  Or,
going further, what are the important rules to take into account when
designing yet another formal language?  How do we compare, how do we
measure, how do we test?  These are the questions that the young field
of Logic Engineering is supposed to investigate and, if possible,
answer.

We still know little about Logic Engineering, and as yet
there are no general answers to these questions.  Do not expect to find
here a list of ``recipes'' for how things should be done.  But much can
be learned from analyzing in detail a particularly interesting case.
This will be the main thrust of the work carried out in the thesis.

Description logics are a family of formal languages used for
structured knowledge representation.  They have been designed as a
tool to describe information in terms of concepts and their
interrelation (definitions), together with means to specify that
certain elements of the domain actually fit such definitions
(assertions).  In addition, they provide a formal notion of inference
in terms of this structured knowledge.  To our knowledge, description
logics are the best example of a broad, homogeneous collection
of formal languages with a clearly specified semantics (in terms of
first-order models) devised to deal with particular applications.
They offer an assortment of specialized inference mechanisms to handle
tasks like knowledge classification, structuring, etc.  The complexity
of reasoning in the different languages of this family has been widely
investigated, theorem provers effectively deciding some of the most
expressive languages have been implemented (and they are among the
fastest provers for non-classical languages available), and these
languages have been successfully applied to many realistic problems,
even at an industrial level.  Connections between description
languages and modal logics have been investigated, but a unifying
logical background theory explaining their expressive power and
logical characteristics was largely missing.  This is the role to be
played by hybrid logics.

Hybrid languages are modal languages extended with the ability to
explicitly refer to elements in the domain of a model.  They were
first introduced in the mid-1960s, in the field of temporal logic, and
were subsequently developed mainly in a purely theoretical
environment.  The work in the field focused on investigating complete
axiomatizations for these languages, characterizing their meta-logical
properties and understanding their semantic and proof-theoretical
behavior.

Hybrid languages provide the exact kind of expressive power required
to match description languages.  Having been optimized for
applications, description logics are difficult to handle with
classical model- and proof-theoretical tools, but given the close
match between description and hybrid logics we will be able to apply
these techniques to the hybrid logic counterpart of description logics
instead.  Going in the other direction, description logics provide
hybrid logics with extensively tested examples of useful languages,
knowledge management lore, and implementations.  In this thesis we will
draw these two complementary fields together and investigate in detail
what each of them has to offer to the other. Given that the two areas
have developed different techniques and evolved in divergent
directions, ``trading'' between them will be especially
fruitful. Description logics can export reasoning methods, complexity
results and application opportunities; while hybrid logics have their
model-theoretical tools, axiomatizations and analyses of expressive
power to offer.

The particular aim of this thesis is, then, to explore and exploit the
connections between description and hybrid logic, their similarities
and differences.  The main results we will present specifically
concern this issue.  But we hope to take the first steps in
setting and discussing this work in the wider perspective of logic
engineering, and provide a small contribution to the general issue of
better understanding the rules behind the good design of new formal
languages.

The thesis is organized in four parts.  In the first, containing
Chapter 1, we discuss different ways of identifying
interesting fragments (and fragments of extensions) of first-order
logic.  We argue that traditional methods, like prenex normal form and
finite variable fragments, are not completely satisfactory. We
propose, instead, to capture relevant fragments _via translations_.
The semantics of many formal languages (including
modal, description and hybrid languages) can be given in terms of
classical logics, and as such they can be considered fragments of
classical languages.  But now, these fragments come together with an
extremely simple presentation --- modal languages, for example, are
usually introduced as extensions of propositional logic ---
and with novel and powerful proof- and model-theoretical tools (simple
tableaux systems, elegant axiomatizations, fine-grained notions of
equivalence between models, new model-theoretical constructions, etc.).
Modal-like logics in general, and description and hybrid logics in
particular, will be presented as examples of useful fragments
identified in such a way.
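
As an illustration of a fragment captured via translation, the following is a sketch of the standard translation of the basic modal language into first-order logic, as it is usually presented in the modal logic literature.  The notation ($ST_x$, the relation symbol $R$, the unary predicate $P$ associated with the proposition symbol $p$) is our choice for this sketch and need not coincide with the notation used in Chapter 1.

```latex
\[
\begin{array}{rcl}
ST_x(p) & = & P(x) \\
ST_x(\neg\varphi) & = & \neg ST_x(\varphi) \\
ST_x(\varphi \wedge \psi) & = & ST_x(\varphi) \wedge ST_x(\psi) \\
ST_x(\langle R\rangle\varphi) & = & \exists y\,(R(x,y) \wedge ST_y(\varphi))
  \qquad (y \mbox{ a fresh variable})
\end{array}
\]
```

A modal formula $\varphi$ is true at a point $w$ of a model exactly when $ST_x(\varphi)$ is true in the same model, viewed as a first-order structure, with $x$ assigned to $w$.  The image of $ST_x$ is the first-order fragment that the modal language identifies.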

Part II introduces both description and hybrid logics (in
Chapters 2 and 3, respectively), providing the
necessary background and the basic notions which will be used in the
rest of the thesis.  The chapters can be read independently and serve
as introductions to the kinds of methods and results which have been
developed in these areas.  They also provide a detailed guide to the
literature.  As we make clear in our presentation, description and
hybrid logics are closely related, and their connections are spelled
out in Chapter 4.  We start by presenting already known
embeddings of description languages into converse propositional
dynamic logics, and discussing why they provide a less satisfactory
match than the one obtained through hybrid languages.  In particular,
we highlight that two ingredients are needed for a successful
embedding: the ability to refer to elements in the domain of a model,
and the ability to make statements about the whole model from a local
point.  The first ingredient is needed to account for assertions, the
second to account for definitions. Both are provided, in an elegant
and direct way, by hybrid languages in the form of nominals, the
satisfiability operator and the existential modality.  We also clarify
the relation between local and global notions of consequence, the
first being the standard notion of consequence for hybrid (and in
general modal) languages while the second is predominant in the
description logic community.
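
The role of the two ingredients can be sketched with the usual textbook rendering of description logic statements as hybrid formulas.  This is a simplified illustration under our own notational assumptions ($C^*$ for the translation of a concept $C$, $\mathsf{A}$ for the universal modality dual to the existential modality $\mathsf{E}$); the translations of Chapter 4 may differ in detail.

```latex
\[
\begin{array}{rcll}
(C \sqcap D)^* & = & C^* \wedge D^* & \\
(\neg C)^* & = & \neg C^* & \\
(\exists R.C)^* & = & \langle R\rangle C^* & \\
a : C & \leadsto & @_a C^* & \mbox{(assertion, via nominals and } @ \mbox{)} \\
(a,b) : R & \leadsto & @_a \langle R\rangle b & \mbox{(assertion, via nominals and } @ \mbox{)} \\
C \sqsubseteq D & \leadsto & \mathsf{A}(C^* \rightarrow D^*) &
  \mbox{(definition, with } \mathsf{A}\varphi = \neg\mathsf{E}\neg\varphi \mbox{)}
\end{array}
\]
```

Assertions speak about named individuals, so they need nominals and the satisfiability operator.  Definitions constrain every element of the domain, so they need a modality that reaches the whole model from any local point.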

After providing two-way satisfaction-preserving translations between
description and hybrid logics, we explore the transfer of results.  We show how
the embedding into hybrid languages provides sharp upper and lower
complexity bounds, separation and characterization results for
expressive power, and meta-logical
properties like interpolation and Beth definability.
Concerning interpolation and Beth
definability, to the best of our knowledge this is the first time that
such results have been investigated in connection with description
languages.  Many of these results are obtained from the general
theorems we will prove in Part III.  We also discuss how results from
description logics can fill important gaps which have not yet received
attention in the hybrid logic community.  Some examples are the known
complexity bounds concerning description logics with counting
operators, or the PSpace results when certain syntactic restrictions
are imposed on the existential modality.


Part III of the thesis contains the core technical work.  In
Chapter 5 we show how ideas from description and hybrid
logics can be put to work with benefit even when the subject is purely
modal.  In particular, aided by the notions of nominal/individual, we
define well-behaved direct resolution methods for modal languages.
This example shows how the additional flexibility provided by the
ability to name states can be used to greatly simplify reasoning
methods.  We proceed to build on the basic resolution method and
obtain extensions for description and hybrid languages.  In
Chapters 6 and 7 we take a hybrid logic
perspective as we dive into model-theoretical issues.  But we have
already demonstrated in Chapter 4 how hybrid logic
results shed light on description languages.

In Chapter 6 we turn to expressive power.  We start by
considering $\Hls(@,\downarrow)$, a very expressive hybrid language.
The two main results concerning this language are
Theorems~\ref{the:charac} and~\ref{general-arrow}.  The first theorem
provides a fivefold characterization of the first-order formulas
equivalent to the translation of a formula in $\Hls(@,\downarrow)$.
In particular, it identifies this fragment as the set of formulas
which are invariant for generated submodels.
Theorem~\ref{general-arrow} shows that the arrow interpolation
property holds not only in this language, but also in any system
obtained from $\Hls(@,\downarrow)$ by the addition of pure axioms.  In
a more general perspective, the results in Chapter 6 show
that $\Hls(@,\downarrow)$ is surprisingly well behaved in
model-theoretical terms.  As we discuss in this chapter, it can be
characterized in many different and natural ways, it responds with
ease to both modal and first-order techniques, and possesses one of
strongest versions of the interpolation and Beth properties we are
aware of for modal languages. For these reasons, $\Hls(@,\downarrow)$
can be used as a ``logical laboratory:'' what we learn from it, using
the plethora of techniques it offers, can provide us, in many cases,
with intuitions on restrictions and extensions.  We see this process
in action throughout the chapter, as we are able to transfer certain
results from $\Hls(@,\downarrow)$ to extensions and sublanguages.
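
To make the operators of $\Hls(@,\downarrow)$ concrete, the following is a minimal model-checking sketch in Python.  It is an illustration only and not code from the thesis: the tuple encoding of formulas and the names `Model`, `holds`, `M` and `LOOP` are hypothetical choices made for this example.  It assumes the usual semantics, in which a nominal is true at exactly one point, $@_i\varphi$ evaluates $\varphi$ at the point named by $i$, and $\downarrow x.\varphi$ binds $x$ to the current point.

```python
from dataclasses import dataclass, field


@dataclass
class Model:
    succ: dict  # world -> set of accessible worlds
    val: dict   # proposition symbol -> set of worlds where it holds
    nom: dict = field(default_factory=dict)  # nominal -> the world it names


def holds(m, w, f, g=None):
    """Return True if formula f is true at world w of model m under assignment g.

    Formulas are nested tuples (an encoding assumed for this sketch):
    ("prop", p), ("nom", i), ("var", x), ("not", f), ("and", f, h),
    ("dia", f), ("at", name, f), ("down", x, f).
    """
    g = {} if g is None else g
    tag = f[0]
    if tag == "prop":
        return w in m.val.get(f[1], set())
    if tag == "nom":   # a nominal is true only at the world it names
        return m.nom[f[1]] == w
    if tag == "var":   # a state variable is true only at the world bound to it
        return g[f[1]] == w
    if tag == "not":
        return not holds(m, w, f[1], g)
    if tag == "and":
        return holds(m, w, f[1], g) and holds(m, w, f[2], g)
    if tag == "dia":   # some accessible world satisfies the argument
        return any(holds(m, v, f[1], g) for v in m.succ.get(w, set()))
    if tag == "at":    # jump to the world named by a nominal or a bound variable
        name = f[1]
        target = m.nom[name] if name in m.nom else g[name]
        return holds(m, target, f[2], g)
    if tag == "down":  # bind the variable to the current world
        return holds(m, w, f[2], {**g, f[1]: w})
    raise ValueError(f"unknown connective: {tag!r}")


# A three-point model: 0 sees 1, 1 sees itself and 2, and 2 sees nothing.
M = Model(succ={0: {1}, 1: {1, 2}, 2: set()}, val={"p": {2}}, nom={"i": 1})

# "The current point sees itself", written with the binder.
LOOP = ("down", "x", ("dia", ("var", "x")))
```

In this model `holds(M, 1, LOOP)` is true and `holds(M, 0, LOOP)` is false.  The formula $\downarrow x.\Diamond x$ holds exactly at reflexive points, a property that the basic modal language cannot express, which gives a small taste of the expressive power discussed in this chapter.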

In Chapter 7 we discuss complexity.  We start with an
excursion into undecidability and we prove that a small fragment of
$\Hls(\downarrow)$ already has an undecidable local satisfiability
problem.  This suggests that only very
severe restrictions on the $\downarrow$ binder will bring us back into
decidability.  We show in Theorem~\ref{the:hl-decnn} that if we
restrict ourselves to sentences of
$\Hls(\pmodop,\umodop,@,\downarrow)$, where $\downarrow$ appears
non-nested, decidability is regained.  In Chapter 4 we
have already shown that even this restricted use of binding proves
interesting from a description logic perspective.  We then turn to
weaker languages (without binders) which remain closer to standard
description languages.  In Theorem~\ref{the:b.k.pspace} we prove that
the addition of nominals and the satisfiability operator to the basic
modal language $\logic{K}$ does not modify its complexity, while it
greatly increases its expressive power.  Interestingly, the same is
not true when we extend the basic temporal language $\logic{K}_t$: the
addition of just one nominal increases the complexity of the local
satisfiability problem to \exptime, when the
class of all models is considered.  But usually temporal languages are
interpreted on models where the accessibility relation is forced to
adopt a ``time-like'' structure, the two best known cases being strict
linear orders (linear time) and transitive trees (branching time).  We
prove in Theorems~\ref{the:t.linear} and~\ref{the:t.trees.pspace} that
over these classes of models, complexity is tamed and again coincides
with the complexity of the basic temporal language.

Part IV contains our conclusions and directions for further research.
Here we highlight some of the lessons we have learned during the
research presented in this thesis.  As we said, we cannot yet hope for
general answers concerning logic engineering, but we can proceed by
analogy: the same questions we posed and answered for description and
hybrid logics can be tested on other formal languages, and we have
presented tools and methodologies (bisimulations, model construction
and comparison games, translations, etc.) which are powerful and
versatile enough to be useful in many diverse situations.