8 February 2006, Computational Linguistics Seminar, Remko Scha
In this talk I will assume that the audience is at least superficially familiar with the exemplar-based approach to language processing known as "Data-Oriented Parsing" (DOP). So far, the models that instantiate the DOP approach have tended to deal exclusively with the syntactic aspects of language processing. The purpose of the talk is to look at the prospects for generalizing this work toward data-oriented models of semantics. To make progress in this direction, two different research agendas may be pursued now:
(1) Exemplar-based models of concept formation -- not only for lexical concepts but also for the operations of "compositional semantics". Work by Renate Bartsch may be a useful starting point here.
(2) Separating the syntactic from the semantic component in our models of sentence probabilities. To the extent that we can use corpora which are annotated semantically as well as syntactically, we may construct Bayesian models which assign distinct probabilities to meanings and to meaning-syntax mappings. At a technical level, there are useful analogies with models for Data-Oriented Translation.
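As a rough illustration of agenda (2), one possible reading of the separation is the factorization P(meaning, syntax) = P(meaning) * P(syntax | meaning), with both factors estimated from a doubly annotated corpus. The sketch below is a minimal toy under strong assumptions, not the model proposed in the talk: the corpus, the labels "m1"/"s1", and the function names are hypothetical, and whole annotations are treated as atomic units with plain relative-frequency estimates, whereas a DOP-style model would work with fragments of the annotated structures.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical toy corpus: each item pairs a meaning label with a syntax label.
# Real annotations would be structured; atomic labels are an assumption here.
corpus = [("m1", "s1"), ("m1", "s1"), ("m1", "s2"), ("m2", "s2")]


def estimate(pairs):
    """Relative-frequency estimates of P(meaning) and P(syntax | meaning)."""
    meaning_counts = Counter(m for m, _ in pairs)
    pair_counts = Counter(pairs)
    total = len(pairs)
    p_meaning = {m: Fraction(c, total) for m, c in meaning_counts.items()}
    p_syntax_given_meaning = {
        (m, s): Fraction(c, meaning_counts[m]) for (m, s), c in pair_counts.items()
    }
    return p_meaning, p_syntax_given_meaning


def joint(meaning, syntax, p_meaning, p_syntax_given_meaning):
    """P(meaning, syntax) = P(meaning) * P(syntax | meaning); 0 if unseen."""
    zero = Fraction(0)
    return p_meaning.get(meaning, zero) * p_syntax_given_meaning.get(
        (meaning, syntax), zero
    )


p_meaning, p_map = estimate(corpus)

# Probability of each (meaning, syntax) pair attested in the toy corpus.
joint_table = {pair: joint(pair[0], pair[1], p_meaning, p_map) for pair in set(corpus)}
```

Keeping the two factors in separate tables means that the distribution over meanings and the meaning-syntax mapping can be inspected, or re-estimated, independently of each other.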
Both lines of thought are directly relevant to the problem of language acquisition, which sooner or later must be faced by all models of linguistic cognition: How are complex syntactic and semantic structures gradually bootstrapped by a system that is merely exposed to concrete real-world situations with accompanying noises?
For more information, see http://staff.science.uva.nl/~jzuidema/CLS