Investigations in Logic, Language and Computation
Erik Aarts
DS-1995-19

Abstract:

In this dissertation the time complexity of three problems is considered. The time complexity of a problem is the relation between the time a computer needs to solve the problem and the size of that problem. In this relation we leave out constant factors. Suppose we have an object and an unsorted list, and we want to know whether the object occurs in the list or not. The size of the problem is the length of the list; call this length n. When we walk through the list and compare each object in the list with the object we are looking for, the time needed is O(n) (order n) in the worst case (the first sketch below illustrates this search). When the relation between the computing time and the size of the problem is a polynomial function (O(n^k) with k fixed), we say that the problem is in P (Polynomial time).

Often a problem is hard to solve while checking a solution is simple. Suppose we have a big number that is the product of two big prime numbers, and we are asked to find the two prime factors. A very simple algorithm is the following: try all smaller numbers as a divisor of the number we have to factor. If the number has n digits, we have to try O(10^n) numbers as a prime factor, so this algorithm runs in exponential rather than polynomial time. (There are much better algorithms, but none of them runs in polynomial time either.) Checking a solution, however, can be done in polynomial time: given two numbers, we only have to multiply them to see whether the product is indeed the number we had to factor (the second sketch below shows both the naive search and the easy check). The class of problems whose solutions can be checked in polynomial time is called NP (Nondeterministic Polynomial time). Problems in NP to which every other problem in NP can be reduced are called NP-complete; these problems are conjectured not to be in P. Factoring is in NP, but it is not known whether it is in P, nor whether it is NP-complete.

The first part of this thesis describes the time complexity of problems in two fields: categorial grammar and acyclic context-sensitive grammar. Categorial grammars consist of a lexicon, in which types are assigned to lexical elements, and a sequent calculus in which we can reason about the derivability of types. A type A is derivable from a sequence of types B1, ..., Bn if the sequent B1, ..., Bn => A is provable in the calculus. The recognition problem is, for a given sentence, to know whether it is in the categorial language or not (the third sketch below gives a simplified recognizer). It is not known whether this problem, when the standard Lambek calculus is used, is in P or NP-complete. This dissertation shows that for certain fragments the problem is in P: the grammars that use the non-associative Lambek calculus and those that use the second-order calculus (second-order meaning that the nesting depth of types is limited to two).

Acyclic context-sensitive grammars are rewrite grammars that generate tree structures with crossing branches; we assign indices to the context elements in the rewrite rules. We look at the problem whether a certain string is in the generated language. For ordinary context-sensitive grammars this is a very hard problem, harder than NP-complete problems; therefore we use acyclic grammars. We show that for these grammars the problem is in P if we vary only the sentences and keep the grammar fixed. If the grammar may change as well (and plays a role in the "size of the problem"), then the problem becomes NP-complete.
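To make the list-search example concrete, here is a minimal Prolog sketch of the linear search through an unsorted list (the predicate name occurs_in is ours, chosen for illustration):

    % occurs_in(+X, +List): succeed iff X occurs in List.
    % Each recursive step inspects one list cell, so at most n cells
    % are visited: O(n) in the worst case.
    occurs_in(X, [X|_]).
    occurs_in(X, [_|Tail]) :-
        occurs_in(X, Tail).

For example, the query ?- occurs_in(c, [a, b, c]). succeeds after walking past a and b.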
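In the same hedged spirit, a sketch of the naive factoring algorithm next to the easy check; between/3 is a standard Prolog built-in, and the predicate names are our own:

    % trial_factor(+N, -D): find a divisor D of N by trying every
    % candidate from 2 upward; exponential in the number of digits of N.
    trial_factor(N, D) :-
        between(2, N, D),
        0 is N mod D.

    % check_factors(+N, +P, +Q): checking a proposed factorization
    % needs only a single multiplication, i.e. polynomial time.
    check_factors(N, P, Q) :-
        N =:= P * Q.

The query ?- trial_factor(91, D). first yields D = 7, after which ?- check_factors(91, 7, 13). succeeds immediately.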
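The recognition problem for categorial grammars can also be pictured in Prolog. The sketch below handles only the application fragment (an AB grammar), not the Lambek calculi actually studied in the dissertation; the type encoding r(A, B) for A/B and l(B, A) for B\A, the toy lexicon, and the predicate names are all assumptions made for illustration:

    % Toy lexicon: types assigned to lexical elements.
    lex(john,  np).
    lex(mary,  np).
    lex(loves, r(l(np, s), np)).    % (np\s)/np, a transitive verb

    % derive(+Types, ?T): the sequence Types reduces to the type T.
    derive([T], T).
    derive(Types, T) :-             % forward application: A/B, B => A
        append(Pre, [r(A, B), B | Post], Types),
        append(Pre, [A | Post], Reduced),
        derive(Reduced, T).
    derive(Types, T) :-             % backward application: B, B\A => A
        append(Pre, [B, l(B, A) | Post], Types),
        append(Pre, [A | Post], Reduced),
        derive(Reduced, T).

    % A sentence is in the language if its types derive the type s.
    recognize(Words) :-
        maplist(lex, Words, Types),
        derive(Types, s).

The query ?- recognize([john, loves, mary]). succeeds: the verb type first consumes the object np to its right and then the subject np to its left.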
The second part of this dissertation is about Prolog programs. Prolog is a very simple but powerful programming language. The big difference with other programming languages is that it is possible to code nondeterministic programs in it. But computers are deterministic machines, so the Prolog program must be converted to a deterministic program in some way. The standard way to do this is depth-first search with backtracking: if a choice must be made, the first option is taken; when it turns out that this option does not lead to any result, the next option is tried, and so on. In this dissertation we consider an alternative search strategy, OLDT resolution. In this strategy one memoizes which alternatives have been tried and what their results were, which prevents subcomputations from being done twice. In the standard search the search space can be represented as a tree, and it is quite possible that two nodes in that tree represent the same subproblem. In OLDT resolution all nodes represent different subproblems by definition; the search space is no longer a tree but a graph (a small illustration of this memoized execution is given below). The idea elaborated in this dissertation is that the size of the search space is an estimate of the time a Prolog program needs, because every branch in the search space is visited only once. The second part gives a method to estimate the time a Prolog program needs under the assumption that OLDT resolution is used in the execution of the program.

Finally, we combine parts 1 and 2. We give Prolog programs that decide whether a string is in a categorial language (for both fragments), and we show that the time those programs need is polynomial in both the length of the sentence and the size of the lexicon.
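As a small illustration of memoing, here is a sketch that uses the table/1 directive of SWI-Prolog, whose tabling implements an OLDT-style strategy; the graph, the predicate names, and the use of SWI-Prolog are assumptions for illustration, not the machinery of the dissertation itself:

    % A small directed graph; note that it contains a cycle.
    edge(a, b).
    edge(b, c).
    edge(c, a).

    :- table path/2.            % memoize answers to path/2 subgoals

    % Transitive closure, written left-recursively.
    path(X, Y) :- edge(X, Y).
    path(X, Y) :- path(X, Z), edge(Z, Y).

Under ordinary depth-first execution the left-recursive clause loops on this cyclic graph, and even on acyclic graphs identical subgoals are re-solved on every branch. With tabling, each distinct subgoal path(X, Z) is solved once and its answers are reused, so the search space becomes a finite graph rather than an infinite tree.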
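In the same vein, the recognizer sketched in the first part could be memoized by adding one directive (again SWI-Prolog syntax, offered only as a hedged illustration of how parts 1 and 2 meet):

    :- table derive/2.          % reduce each sequence of types only once

Each distinct type sequence is then derived at most once; this sharing of subcomputations is what OLDT resolution provides, and it is the basis on which the dissertation bounds the running time of its actual recognition programs.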