Automatic retrieval and clustering of similar words
D. Lin. Proceedings of the 17th international conference on Computational linguistics, pages 768--774. Morristown, NJ, USA, Association for Computational Linguistics, (1998)
Bootstrapping semantics from text is one of the greatest challenges in natural language learning. We first define a word similarity measure based on the distributional pattern of words. The similarity measure allows us to construct a thesaurus using a parsed corpus. We then present a new evaluation methodology for the automatically constructed thesaurus. The evaluation results show that the thesaurus is significantly closer to WordNet than Roget Thesaurus is.