Here is a synopsis of my thesis. A short abstract is also available.

The Minimum Description Length Principle

and Reasoning under Uncertainty

Peter Grünwald

ILLC-Dissertation Series nr. DS-1998-03

Most research reported in the thesis concerns the so-called Minimum Description Length (MDL) Principle.

The result of statistical analysis of a given set of data is nearly always a model for this data that is a gross simplification of the process that actually underlies the data. Nevertheless, such overly simple models are often used with great success to classify and/or predict (aspects of) future data generated by the same process. For example, we use linear models for data which are not really linearly related; we assume that `errors' (discrepancies between the actual data and the assumed underlying functional relationship) are normally distributed, whereas closer inspection reveals they are not; we assume data to be independent when they are not; and so on. Yet such simplifying - and wrong - assumptions often lead to acceptable prediction, interpolation and extrapolation.
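The phenomenon described above can be illustrated with a small sketch (the data and numbers are mine, purely for illustration, not from the thesis): a linear model is fitted to data generated by a non-linear process, and its prediction error over the observed range is nevertheless modest.

```python
import math

# Illustrative data from a non-linear process y = sqrt(x); the linear
# model y = a*x + b is "wrong" for these data, yet fits acceptably.
xs = [float(i) for i in range(1, 11)]
ys = [math.sqrt(x) for x in xs]

# Ordinary least-squares fit of the (wrong) linear model.
n = len(xs)
mx = sum(xs) / n
my = sum(ys) / n
a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
b = my - a * mx

# Root-mean-squared error of the linear model over the observed range:
# small compared to the scale of the y-values (roughly 1 to 3).
rmse = math.sqrt(sum((a * x + b - y) ** 2 for x, y in zip(xs, ys)) / n)
```

The point is not that the linear model is right, but that its errors stay small enough for practical interpolation on this range.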

Why do such wrong models nevertheless work, and when can they safely be used? These are the central questions of the first part of the thesis. It turns out that they are closely related to the question of whether the MDL Principle can be theoretically justified: in the presence of few data, the MDL Principle will often select a model for the data that, once more data have become available, turns out to be too simple. Can one show that this nevertheless leads to acceptable (or even, in a sense, optimal) results? Briefly, we reach the following conclusion: overly simple models can be applied to make predictions and decisions in two different ways: a `safe' one and a `risky' one. If a model is used in the `safe' way, then it will be `reliable' in the following sense: the model gives a correct impression of the prediction error that will be made if the model is used to predict future data, even in the case where the model is a gross simplification of the process that truly underlies the given data. If the model is used in the `risky' way, there is no such guarantee (nevertheless, such usage of a model often makes sense). We state and prove several theorems which show that incorrect models can be `reliable' in the sense indicated above under many circumstances. The concept of `reliability' is based on a non-standard interpretation of probabilities. This is the second main theme of the thesis:
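The notion of `reliability' above can be sketched numerically (this is my own toy illustration, not a construction from the thesis): a wrong linear model is fitted to data from a non-linear process, and its own impression of its prediction error (the error on the data it was fitted to) is compared with the error it actually makes on fresh data from the same process. With enough i.i.d. data the two agree closely, even though the model is wrong.

```python
import math
import random

random.seed(0)

def sample(n):
    # Illustrative truth: y = sqrt(x) plus small uniform noise.
    xs = [random.uniform(1.0, 10.0) for _ in range(n)]
    ys = [math.sqrt(x) + random.uniform(-0.1, 0.1) for x in xs]
    return xs, ys

def ols(xs, ys):
    # Least-squares fit of the (wrong) linear model y = a*x + b.
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx

def mse(a, b, xs, ys):
    return sum((a * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

xs, ys = sample(2000)
a, b = ols(xs, ys)
train_err = mse(a, b, xs, ys)    # the model's own impression of its error
xs2, ys2 = sample(2000)
test_err = mse(a, b, xs2, ys2)   # the error actually made on fresh data
```

Here `train_err` and `test_err` come out nearly equal: the wrong model is `reliable' in the sense that it does not mislead us about how badly it will predict.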

It so happens that the notions of `description method' and `probability distribution' are very closely connected: every description method or code corresponds to a probability distribution, and vice versa.
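This correspondence can be made concrete in a few lines (a standard textbook fact, here in a toy example of my own): given a distribution P, assigning each outcome a code of length L(x) = -log2 P(x) bits yields lengths satisfying the Kraft inequality, sum over x of 2^(-L(x)) <= 1, which is exactly the condition for a prefix-free code to exist; conversely, any such lengths define a (sub)probability distribution via P(x) = 2^(-L(x)).

```python
import math

# A toy distribution over four symbols, chosen so code lengths are integers.
p = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}

# Shannon code lengths: L(x) = -log2 P(x) bits (here: 1, 2, 3, 3).
lengths = {x: -math.log2(px) for x, px in p.items()}

# Kraft inequality: sum of 2^(-L(x)) must be at most 1; going back from
# lengths to probabilities via 2^(-L(x)) recovers the distribution.
kraft = sum(2 ** -L for L in lengths.values())
```

For this dyadic distribution the Kraft sum is exactly 1, so the code wastes no code space; for general distributions, rounding the lengths up to integers still keeps the sum at most 1.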