Relevance data for language models using maximum likelihood

David Bodoff, Bin Wu, K. Y. Michael Wong

Research output: Contribution to journal › Article

Abstract

We present a preliminary empirical test of a maximum likelihood approach to using relevance data for training information retrieval (IR) parameters. Similar to language models, our method uses explicitly hypothesized distributions for documents and queries, but we add to this an explicitly hypothesized distribution for relevance judgments. The method unifies document-oriented and query-oriented views. Performance is better than the Rocchio heuristic for document and/or query modification. The maximum likelihood methodology also motivates a heuristic estimate of the MLE optimization. The method can be used to test competing hypotheses regarding the processes of authors' term selection, searchers' term selection, and assessors' relevancy judgments.

Original language: English (US)
Pages (from-to): 1050-1061
Number of pages: 12
Journal: Journal of the American Society for Information Science and Technology
Volume: 54
Issue number: 11
Publication status: Published - Sep 1 2003
Externally published: Yes

ASJC Scopus subject areas

  • Software
  • Information Systems
  • Human-Computer Interaction
  • Computer Networks and Communications
  • Artificial Intelligence
