Generating and browsing multiple taxonomies over a document collection

Scott Spangler, Jeffrey T. Kreulen, Justin Lessler

Research output: Contribution to journal, Article

Abstract

We present a novel system and methodology for generating and then browsing multiple taxonomies over a document collection. Taxonomies are generated using a broad set of capabilities, including metadata, keyword queries, and automated clustering techniques that serve as a seed taxonomy. The taxonomy editor, eClassifier, provides powerful tools to visualize and edit each taxonomy to make it reflective of the desired theme. Cluster validation tools allow the editor to verify that documents received in the future can be automatically classified into each taxonomy with sufficiently high accuracy. In general, those seeking knowledge from a document collection may have only a vague notion of exactly what they are attempting to understand, and would like to explore related topics and concepts rather than simply being given a set of documents. For this purpose, we have developed MindMap, an interface that uses multiple taxonomies to let users interact with a document collection.
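The idea of holding several taxonomies over one collection and browsing across them can be illustrated with a minimal sketch. It is not eClassifier or MindMap: it assumes taxonomies seeded only by keyword queries, and the names `build_taxonomy` and `cross_browse`, the sample documents, and the category keywords are all hypothetical.

```python
from collections import defaultdict


def build_taxonomy(docs, category_keywords, default="Miscellaneous"):
    """Seed a taxonomy: put each document in the first category whose
    keyword set shares a word with the document (hypothetical helper)."""
    taxonomy = defaultdict(set)
    for doc_id, text in docs.items():
        words = set(text.lower().split())
        for category, keywords in category_keywords.items():
            if words & keywords:
                taxonomy[category].add(doc_id)
                break
        else:
            # No keyword matched, so the document goes to a catch-all category.
            taxonomy[default].add(doc_id)
    return dict(taxonomy)


def cross_browse(taxonomy_a, taxonomy_b):
    """List the documents shared by each pair of categories drawn from
    two taxonomies over the same collection (hypothetical helper)."""
    table = {}
    for cat_a, ids_a in taxonomy_a.items():
        for cat_b, ids_b in taxonomy_b.items():
            shared = ids_a & ids_b
            if shared:
                table[(cat_a, cat_b)] = sorted(shared)
    return table


# Made-up sample collection, for illustration only.
docs = {
    1: "printer driver fails after windows update",
    2: "laptop battery drains quickly on linux",
    3: "printer jams on linux workstation",
    4: "windows laptop screen flickers",
}

by_product = build_taxonomy(docs, {"Printer": {"printer"}, "Laptop": {"laptop"}})
by_platform = build_taxonomy(docs, {"Windows": {"windows"}, "Linux": {"linux"}})
table = cross_browse(by_product, by_platform)
```

Each taxonomy partitions the same four documents along a different theme, and `table` maps a category pair such as `("Printer", "Linux")` to the documents that fall in both, which is the kind of cross-taxonomy view a browsing interface could expose.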

Original language: English (US)
Pages (from-to): 191-212
Number of pages: 22
Journal: Journal of Management Information Systems
Volume: 19
Issue number: 4
State: Published - Jan 1 2003
Externally published: Yes

Keywords

  • Data mining
  • Document classification
  • Document clustering techniques
  • Knowledge management
  • Navigation
  • Taxonomy
  • Text mining
  • Visualization

ASJC Scopus subject areas

  • Management Information Systems
  • Computer Science Applications
  • Management Science and Operations Research
  • Information Systems and Management
