Investigations in Logic, Language and Computation
Erik Aarts
DS-1995-19

Abstract:

In this dissertation the time complexity of three problems is considered. The time complexity of a problem is the relation between the time a computer needs to solve the problem and the size of that problem. In this relation we leave out constant factors. Suppose we have an object and an unsorted list, and we want to know whether the object occurs in the list or not. The size of the problem is the length of the list; call this length n. When we walk through the list and compare each object in the list with the object we are looking for, the time needed is O(n) (order n) in the worst case (the first sketch below illustrates this search). When the relation between the computing time and the size of the problem is a polynomial function (O(n^k) with k fixed), we say that the problem is in P (Polynomial time).

Often a problem is hard to solve while checking a solution is simple. Suppose we have a big number that is the product of two big prime numbers, and we are asked to find the two prime factors. A very simple algorithm is the following: try all smaller numbers as a divisor of the number we have to factor. If the number has n digits, we have to try O(10^n) numbers as a prime factor, so this algorithm runs in exponential rather than polynomial time. (There are much better algorithms, but none of them runs in polynomial time either.) Checking a solution, however, can be done in polynomial time: given two numbers, we only have to multiply them to see whether the product is indeed the number we had to factor (the second sketch below shows both the naive search and the easy check). The class of problems whose solutions can be checked in polynomial time is called NP (Nondeterministic Polynomial time). Problems in NP to which every other problem in NP can be reduced are called NP-complete; these problems are conjectured not to be in P. Factoring is in NP, but it is not known whether it is in P, nor whether it is NP-complete.

The first part of this thesis describes the time complexity of problems in two fields: categorial grammar and acyclic context-sensitive grammar. Categorial grammars consist of a lexicon, in which types are assigned to lexical elements, and a sequent calculus in which we can reason about the derivability of types. A type A is derivable from a sequence of types B1, ..., Bn if the sequent B1, ..., Bn => A is provable in the calculus. The recognition problem is, for a given sentence, to know whether it is in the categorial language or not (the third sketch below gives a simplified recognizer). It is not known whether this problem, when the standard Lambek calculus is used, is in P or NP-complete. This dissertation shows that for certain fragments the problem is in P: the grammars that use the non-associative Lambek calculus and those that use the second-order calculus (second-order meaning that the nesting depth of types is limited to two).

Acyclic context-sensitive grammars are rewrite grammars that generate tree structures with crossing branches; we assign indices to the context elements in the rewrite rules. We look at the problem whether a certain string is in the generated language. For ordinary context-sensitive grammars this is a very hard problem, harder than NP-complete problems; therefore we use acyclic grammars. We show that for these grammars the problem is in P if we vary only the sentences and keep the grammar fixed. If the grammar may change as well (and plays a role in the "size of the problem"), then the problem becomes NP-complete.
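To make the list-search example concrete, here is a minimal Prolog sketch of the linear search through an unsorted list (the predicate name occurs_in is ours, chosen for illustration):

    % occurs_in(+X, +List): succeed iff X occurs in List.
    % Each recursive step inspects one list cell, so at most n cells
    % are visited: O(n) in the worst case.
    occurs_in(X, [X|_]).
    occurs_in(X, [_|Tail]) :-
        occurs_in(X, Tail).

For example, the query ?- occurs_in(c, [a, b, c]). succeeds after walking past a and b.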
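In the same hedged spirit, a sketch of the naive factoring algorithm next to the easy check; between/3 is a standard Prolog built-in, and the predicate names are our own:

    % trial_factor(+N, -D): find a divisor D of N by trying every
    % candidate from 2 upward; exponential in the number of digits of N.
    trial_factor(N, D) :-
        between(2, N, D),
        0 is N mod D.

    % check_factors(+N, +P, +Q): checking a proposed factorization
    % needs only a single multiplication, i.e. polynomial time.
    check_factors(N, P, Q) :-
        N =:= P * Q.

The query ?- trial_factor(91, D). first yields D = 7, after which ?- check_factors(91, 7, 13). succeeds immediately.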
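The recognition problem for categorial grammars can also be pictured in Prolog. The sketch below handles only the application fragment (an AB grammar), not the Lambek calculi actually studied in the dissertation; the type encoding r(A, B) for A/B and l(B, A) for B\A, the toy lexicon, and the predicate names are all assumptions made for illustration:

    % Toy lexicon: types assigned to lexical elements.
    lex(john,  np).
    lex(mary,  np).
    lex(loves, r(l(np, s), np)).    % (np\s)/np, a transitive verb

    % derive(+Types, ?T): the sequence Types reduces to the type T.
    derive([T], T).
    derive(Types, T) :-             % forward application: A/B, B => A
        append(Pre, [r(A, B), B | Post], Types),
        append(Pre, [A | Post], Reduced),
        derive(Reduced, T).
    derive(Types, T) :-             % backward application: B, B\A => A
        append(Pre, [B, l(B, A) | Post], Types),
        append(Pre, [A | Post], Reduced),
        derive(Reduced, T).

    % A sentence is in the language if its types derive the type s.
    recognize(Words) :-
        maplist(lex, Words, Types),
        derive(Types, s).

The query ?- recognize([john, loves, mary]). succeeds: the verb type first consumes the object np to its right and then the subject np to its left.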
The second part of this dissertation is about Prolog programs. Prolog is a very simple but powerful programming language. The big difference with other programming languages is that it is possible to code nondeterministic programs in it. But computers are deterministic machines, so the Prolog program must be converted to a deterministic program in some way. The standard way to do this is depth-first search with backtracking: if a choice must be made, the first option is taken; when it turns out that this option does not lead to any result, the next option is tried, and so on. In this dissertation we consider an alternative search strategy, OLDT resolution. In this strategy one memoizes which alternatives have been tried and what their results were, which prevents subcomputations from being done twice. In the standard search the search space can be represented as a tree, and it is quite possible that two nodes in that tree represent the same subproblem. In OLDT resolution all nodes represent different subproblems by definition; the search space is no longer a tree but a graph (a small illustration of this memoized execution is given below). The idea elaborated in this dissertation is that the size of the search space is an estimate of the time a Prolog program needs, because every branch in the search space is visited only once. The second part gives a method to estimate the time a Prolog program needs under the assumption that OLDT resolution is used in the execution of the program.

Finally, we combine parts 1 and 2. We give Prolog programs that decide whether a string is in a categorial language (for both fragments), and we show that the time those programs need is polynomial in both the length of the sentence and the size of the lexicon.
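As a small illustration of memoing, here is a sketch that uses the table/1 directive of SWI-Prolog, whose tabling implements an OLDT-style strategy; the graph, the predicate names, and the use of SWI-Prolog are assumptions for illustration, not the machinery of the dissertation itself:

    % A small directed graph; note that it contains a cycle.
    edge(a, b).
    edge(b, c).
    edge(c, a).

    :- table path/2.            % memoize answers to path/2 subgoals

    % Transitive closure, written left-recursively.
    path(X, Y) :- edge(X, Y).
    path(X, Y) :- path(X, Z), edge(Z, Y).

Under ordinary depth-first execution the left-recursive clause loops on this cyclic graph, and even on acyclic graphs identical subgoals are re-solved on every branch. With tabling, each distinct subgoal path(X, Z) is solved once and its answers are reused, so the search space becomes a finite graph rather than an infinite tree.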
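In the same vein, the recognizer sketched in the first part could be memoized by adding one directive (again SWI-Prolog syntax, offered only as a hedged illustration of how parts 1 and 2 meet):

    :- table derive/2.          % reduce each sequence of types only once

Each distinct type sequence is then derived at most once; this sharing of subcomputations is what OLDT resolution provides, and it is the basis on which the dissertation bounds the running time of its actual recognition programs.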