Item – Theses Canada

OCLC number
55885310
Link(s) to full text
LAC copy
Author
Raghuram, Rayan, 1974.
Title
An experiment in intelligent text processing.
Degree
M. Sc. -- University of Windsor, 2001
Publisher
Ottawa : National Library of Canada = Bibliothèque nationale du Canada, [2004]
Description
2 microfiches.
Notes
Includes bibliographical references.
Abstract
Traditional approaches to information retrieval (IR) are statistical in nature and depend on data gathered from corpus analysis. Although such approaches yield immediately practical computational models for IR, they exhibit a maximal accuracy of 40% when applied to an open-ended corpus. Language-based approaches, on the other hand, offer more intuitive solutions to text retrieval with higher precision. The problem is that these approaches require natural language understanding (NLU), which we do not yet have. It is commonly accepted that vast amounts of commonsense knowledge are required to attempt problems in NLU. The lack of proper and complete knowledge structures for representing commonsense knowledge has left intelligent IR little short of abandoned. In this thesis, we emphasize that complete NLU is not necessary for intelligent IR, since the text does not have to be understood completely. Discovering the 'aboutness' of a text appears sufficient to perform intelligent IR with precision far better than that of the traditional approaches. We prove this thesis existentially by implementing Digital Agora, an intelligent IR system that indexes and retrieves text based on subject content. Although we succeeded in showing that intelligent IR based on conceptual analysis is possible without complete NLU, we observed that the high computational complexity such an approach demands makes it impractical. Having traced this complexity bottleneck to lexical disambiguation, we implement, test, and present initial results of a computational model based on the Formal Ontology that attempts parallel marker propagation for lexical disambiguation.
ISBN
0612840816
9780612840812