Relevance data for language models using maximum likelihood

David Bodoff, Bin Wu, K. Y. Michael Wong

Research output: Contribution to journal › Article › peer-review

3 Scopus citations

Abstract

We present a preliminary empirical test of a maximum likelihood approach to using relevance data for training information retrieval (IR) parameters. Similar to language models, our method uses explicitly hypothesized distributions for documents and queries, but we add to this an explicitly hypothesized distribution for relevance judgments. The method unifies document-oriented and query-oriented views. Performance is better than the Rocchio heuristic for document and/or query modification. The maximum likelihood methodology also motivates a heuristic estimate of the MLE optimization. The method can be used to test competing hypotheses regarding the processes of authors' term selection, searchers' term selection, and assessors' relevancy judgments.

Original language: English (US)
Pages (from-to): 1050-1061
Number of pages: 12
Journal: Journal of the American Society for Information Science and Technology
Volume: 54
Issue number: 11
State: Published - Sep 2003
Externally published: Yes

ASJC Scopus subject areas

  • Software
  • Information Systems
  • Human-Computer Interaction
  • Computer Networks and Communications
  • Artificial Intelligence
