Sunday, August 17, 2008

The boundary between words and nonwords: the Borgmann Project to list all the words of English

I watched a lecture by Chris Cole on Google Tech Talks:

The Borgmann Project: Listing all the Words in English

Here is a critical blog post on it, by eternally stressed semanticist Lance Nathan, at U Penn Linguistics.

I am more sympathetic to the project than he is. I like Cole's insight that the "list" will be a process that settles on a stable solution by working from the top down (generative rules that yield probabilities) and from the bottom up (occurrences in a corpus).
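To make the two directions concrete for myself, here is a toy sketch in Python. It is not Cole's method, only one way such a process could look. The suffix rule, the scores 0.8 and 0.05, the saturation constant, the noisy-OR combination, and all the function names are assumptions I made up for illustration.

```python
from collections import Counter

def rule_probability(candidate, accepted, suffixes=("s", "ed", "ing", "ness")):
    """Top down: a hypothetical rule that scores a candidate from known words."""
    if candidate in accepted:
        return 1.0
    for suffix in suffixes:
        if candidate.endswith(suffix) and candidate[: -len(suffix)] in accepted:
            return 0.8  # assumed score for a regular derivation
    return 0.05  # assumed small prior for everything else

def corpus_evidence(candidate, counts, saturation=5):
    """Bottom up: evidence in [0, 1) that grows with corpus occurrences."""
    n = counts[candidate]
    return n / (n + saturation)

def wordhood(candidate, accepted, counts):
    """Noisy-OR: the score is low only if both sources of evidence are weak."""
    p_rule = rule_probability(candidate, accepted)
    p_corpus = corpus_evidence(candidate, counts)
    return 1 - (1 - p_rule) * (1 - p_corpus)

def stabilize(seed_words, corpus_tokens, threshold=0.7, max_rounds=10):
    """Re-score the candidates until the accepted list stops changing."""
    counts = Counter(corpus_tokens)
    accepted = set(seed_words)
    for _ in range(max_rounds):
        newly = {w for w in counts if wordhood(w, accepted, counts) >= threshold}
        updated = accepted | newly
        if updated == accepted:
            break  # stable: another round would add nothing
        accepted = updated
    return accepted

tokens = ["walk", "walked", "walking", "walkings", "xqz"] + ["blog"] * 20
words = stabilize({"walk"}, tokens)
print(sorted(words))
```

In this toy run "walkings" only gets in on the second round, after "walking" has been accepted, and "blog" gets in on corpus frequency alone even though no rule produces it. A string like "xqz" that is both rare and underivable stays out. That interplay is the part of the idea I find appealing.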
