Quantitative analysis of literary styles

Roger D. Peng, Nicolas W. Hengartner

Research output: Contribution to journal › Article › peer-review

Abstract

Writers are often viewed as having an inherent style that can serve as a literary fingerprint. By quantifying relevant features related to literary style, one may hope to classify written works and even attribute authorship to newly discovered texts. Beyond its intrinsic interest, the study of literary styles presents the opportunity to introduce and motivate many standard multivariate statistical techniques. Today the statistical analysis of literary styles is made much simpler by the wealth of real data readily available from the Internet. This article presents an overview and brief history of the analysis of literary styles. In addition, we use canonical discriminant analysis and principal component analysis to identify structure in the data and distinguish authorship.
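
As a rough illustration of what "quantifying relevant features related to literary style" can look like in practice, the sketch below turns a text into a vector of function-word rates (occurrences per 1000 tokens). It is a minimal sketch under stated assumptions, not the article's procedure: the word list `FUNCTION_WORDS`, the tokenizer, and the helper name `function_word_rates` are all hypothetical choices made for this example. Vectors of this kind are the sort of input that principal component analysis or canonical discriminant analysis would then operate on.

```python
import re
from collections import Counter

# Hypothetical, illustrative function-word list (an assumption for this
# sketch; it is not the list used in the article).
FUNCTION_WORDS = ["the", "of", "and", "to", "a", "in", "that", "it"]


def function_word_rates(text, words=FUNCTION_WORDS):
    """Return the rate per 1000 tokens of each word in `words`.

    Tokenization is deliberately crude: lowercase the text and keep
    runs of letters and apostrophes.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens)
    if total == 0:
        # An empty text has no tokens, so every rate is zero.
        return [0.0] * len(words)
    counts = Counter(tokens)
    return [1000.0 * counts[w] / total for w in words]


if __name__ == "__main__":
    sample = "The cat sat on the mat, and it purred."
    for word, rate in zip(FUNCTION_WORDS, function_word_rates(sample)):
        print(f"{word:>5}: {rate:7.1f} per 1000 tokens")
```

Computing one such vector per text (or per chapter) yields a texts-by-words matrix; rates are used rather than raw counts so that texts of different lengths remain comparable.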

Original language: English (US)
Pages (from-to): 175-185
Number of pages: 11
Journal: American Statistician
Volume: 56
Issue number: 3
State: Published - Aug 2002
Externally published: Yes

Keywords

  • Authorship
  • Canonical discriminant analysis
  • Data visualization
  • Function words
  • High-dimensional data
  • Principal component analysis

ASJC Scopus subject areas

  • Statistics and Probability
  • Mathematics (all)
  • Statistics, Probability and Uncertainty
