Supporting the intelligence analyst: Evaluating a model of semantics for automated text processing
This thesis studies automated text processing as it pertains to the document filtering task performed by intelligence analysts. To help analysts overcome the processing limitations imposed by human cognition, emerging software products use models of human semantics as automatic text processing engines. In this study, human document similarity judgments are collected and serve as a baseline against which a leading model of human semantics, Latent Semantic Analysis (LSA), is evaluated. Configured in the conventional way, LSA falls short of emulating human performance in the experiment reported here. Subsequent simulations point towards a two-stage process. In the first stage, content-based similarity is assessed against a criterion. If similarity falls below this criterion, a second similarity assessment is made after name-based information receives substantial amplification.
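
The two-stage process described above might be sketched as follows. This is an illustrative sketch only, not the configuration used in the thesis. It assumes documents are already represented as sparse term-weight vectors, and it uses plain cosine similarity in place of a full LSA space (no singular value decomposition is performed). The function names, the criterion value of 0.5, and the amplification factor of 5.0 are hypothetical choices made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def amplify_names(vec, names, gain):
    """Return a copy of vec with the weights of name terms multiplied by gain."""
    return {t: (w * gain if t in names else w) for t, w in vec.items()}


def two_stage_similarity(doc_a, doc_b, names, criterion=0.5, name_gain=5.0):
    """Return (similarity, stage) for a pair of documents.

    Stage 1: content-based similarity on the unmodified vectors.
    Stage 2: only if stage 1 falls below the criterion, recompute
    similarity after amplifying name terms.
    The criterion and name_gain defaults are assumed values.
    """
    stage_one = cosine(doc_a, doc_b)
    if stage_one >= criterion:
        return stage_one, 1
    stage_two = cosine(
        amplify_names(doc_a, names, name_gain),
        amplify_names(doc_b, names, name_gain),
    )
    return stage_two, 2
```

Under these assumptions, two documents that share little content but mention the same name would fail the stage-one criterion and then receive a higher similarity score in stage two, while documents with strongly overlapping content would be resolved in stage one without any name amplification.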