Extracting information from big data: Issues of measurement, inference and linkage

Frauke Kreuter, Roger D. Peng

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract

Big data pose several interesting and new challenges to statisticians and others who want to extract information from data. As Groves pointedly commented, the era is “appropriately called Big Data as opposed to Big Information,” because there is a lot of work for analysts before information can be gained from “auxiliary traces of some process that is going on in the society.” The analytic challenges most often discussed are those related to three of the Vs that are used to characterize big data. The volume of truly massive data requires expansion of processing techniques that match modern hardware infrastructure, cloud computing with appropriate optimization mechanisms, and re-engineering of storage systems. The velocity of the data calls for algorithms that allow learning and updating on a continuous basis, and of course the computing infrastructure to do so. Finally, the variety of the data structures requires statistical methods that more easily allow for the combination of different data types collected at different levels, sometimes with a temporal and geographic structure. However, when it comes to privacy and confidentiality, the challenges of extracting (meaningful) information from big data are in our view similar to those associated with data of much smaller size, surveys being one example. For any statistician or quantitatively working (social) scientist, there are two main concerns when extracting information from data, which we summarize here as concerns about measurement and concerns about inference. Both of these aspects can be affected by privacy and confidentiality concerns.

Original language: English (US)
Title of host publication: Privacy, Big Data, and the Public Good
Subtitle of host publication: Frameworks for Engagement
Publisher: Cambridge University Press
Pages: 257-275
Number of pages: 19
Volume: 9781107067356
ISBN (Electronic): 9781107590205
ISBN (Print): 9781107067356
DOI: https://doi.org/10.1017/CBO9781107590205.016
State: Published - Jan 1 2013

ASJC Scopus subject areas

  • Mathematics (all)

Cite this

Kreuter, F., & Peng, R. D. (2013). Extracting information from big data: Issues of measurement, inference and linkage. In Privacy, Big Data, and the Public Good: Frameworks for Engagement (Vol. 9781107067356, pp. 257-275). Cambridge University Press. https://doi.org/10.1017/CBO9781107590205.016