Macroanalysis: Digital Methods and Literary History

Copyright Date: 2013
DOI: 10.5406/j.ctt2jcc3m
Pages: 328
    Book Description:

    In this volume, Matthew L. Jockers introduces readers to large-scale literary computing and the revolutionary potential of macroanalysis--a new approach to the study of the literary record designed for probing the digital-textual world as it exists today, in digital form and in large quantities. Using computational analysis to retrieve key words, phrases, and linguistic patterns across thousands of texts in digital libraries, researchers can draw conclusions based on quantifiable evidence regarding how literary trends are employed over time, across periods, within regions, or within demographic groups, as well as how cultural, historical, and societal linkages may bind individual authors, texts, and genres into an aggregate literary culture. Moving beyond the limitations of literary interpretation based on the close reading of individual works, Jockers describes how this new method of studying large collections of digital material can help us to better understand and contextualize the individual works within those collections.

    eISBN: 978-0-252-09476-7
    Subjects: Technology, Education, Language & Literature

Table of Contents

  1. Front Matter
    (pp. i-vi)
  2. Table of Contents
    (pp. vii-viii)
    (pp. ix-x)
      (pp. 3-4)

      An article in the June 23, 2008, issue of Wired declared in its headline “Data Deluge Makes the Scientific Method Obsolete” (Anderson 2008). By 2008 computers, with their capacity for number crunching and processing large-scale data sets, had revolutionized the way that scientific research gets done, so much so that the same article declared an end to theorizing in science. With so much data, we could just run the numbers and reach a conclusion. Now, slowly and surely, the same elements that have had such an impact on the sciences are revolutionizing the way that research in the humanities gets...

    • 2 EVIDENCE
      (pp. 5-10)

      While still graduate students in the early 1990s, my wife and I invited some friends to share Thanksgiving dinner. One of the friends was, like my wife and me, a graduate student in English. The other, however, was an outsider, a graduate student from geology. The conversation that night ranged over a wine-fueled spectrum of topics, but as three of the four of us were English majors, things eventually came around to literature. There was controversy when we came to discuss the “critical enterprise” and what it means to engage in literary research. The very term research was discussed and...

      (pp. 11-23)

      As noted previously, there is a significant tradition of researchers employing computational approaches to the study of literature and an even longer tradition of scholars employing quantitative and statistical methods for the analysis of text. The specifically computational tradition dates back to the work of Father Roberto Busa, and since that time momentum has been building, exponentially, so that now, somewhat suddenly, the trend line has rocketed upward and the “digital humanities” have burst upon the scene and become a ubiquitous topic of discussion in humanities programs across the globe.* Notwithstanding the fact that there is no general agreement as...

      (pp. 24-32)

      The approach to the study of literature that I am calling “macroanalysis” is in some general ways akin to economics or, more specifically, to macroeconomics. Before the 1930s, before Keynes’s General Theory of Employment, Interest, and Money in 1936, there was no defined field of “macroeconomics.” There was, however, neoclassical economics, or “microeconomics,” which studies the economic behavior of individual consumers and individual businesses. As such, microeconomics can be seen as analogous to our study of individual texts via “close readings.” Macroeconomics, however, is about the study of the entire economy. It tends toward enumeration and quantification and is in...

    • 5 METADATA
      (pp. 35-62)

      This chapter offers a first example of how the macroanalytic approach brings new knowledge to our understanding of literary history. This chapter also begins the larger exploration of influence that forms a unifying thread in this book. The evidence presented here is primarily quantitative; it was gathered from a large literary bibliography using ad hoc computational tools. To an extent, this chapter is about harvesting some of the lowest hanging fruit of literary history. Many decades before mass-digitization efforts, libraries were digitizing an important component of their collections in the form of online, electronic catalogs. These searchable bibliographies contain a...

    • 6 STYLE
      (pp. 63-104)

      In statistical or quantitative authorship attribution, a researcher attempts to classify a work of unknown or disputed authorship in order to assign it to a known author based on a training set of works of known authorship. Unlike more general document classification, in authorship attribution we do not want to classify documents based on shared or similar document content. Instead, the researcher performs classification based upon an author’s unique signal, or “style.” The working assumption of all such investigation is that writers have distinct and detectable stylistic habits, or “tics.” A consistent problem for authorship researchers, however, is the possibility...

      (pp. 105-117)

      The previous chapter demonstrated how stylistic signals could be derived from high-frequency features and how the usage, or nonusage, of those features was susceptible to influences that are external to what we might call “authorial style,” external influences such as genre, time, and gender. These aspects of style were explored using a controlled corpus of 106 British texts where genre was a key point of analysis. The potential influences or entailments of nationality have not yet been examined. Clearly, nations have habits of style that can be identified and traced. Consider, for example, the British habit of dropping the...

    • 8 THEME
      (pp. 118-153)

      A typical complaint about computational stylistics is that such studies fail to investigate the aspects of writing that readers care most deeply about, namely, plot, character, and theme.* In the previous chapter, we saw how stylistic information can be usefully extracted from texts in a corpus and how the derivative data can be used to chart linguistic macro patterns and macro trends present in a century’s worth of novels. I also began to address the trickier business of theme through a discussion of a particularly “British” word cluster, a cluster that I suggested as a possible surrogate for an expression...

      (pp. 154-168)

      Examining macro patterns in style and theme allows us to contextualize our close readings in ways that have hitherto been impossible or, at the very minimum, impractical. We see, for example, that while Melville may be best remembered for Moby Dick, Moby Dick was only the apex text in a longer tradition of whaling- and seafaring-themed fiction, a tradition that stretches back at least to Sir Walter Scott’s book The Pirate (1821) and through the work of Frederick Marryat.* Along the way, from Scott to Marryat to Melville, other writers touch upon and help build the themes that ultimately find...

    • 10 ORPHANS
      (pp. 171-176)

      I began this book with a call to arms, an argument that what we have today in terms of literary and textual material and computational power represents a moment of revolution in the way we study the literary record. I suggested that our primary tool of excavation, close reading, is no longer satisfactory as a solitary method of literary inquiry. I argue throughout these chapters that large-scale text analysis, text mining, “macroanalysis,” offers an important and necessary way of contextualizing our study of individual works of literature. I hope I have made it clear that the macroanalysis I imagine is...

    (pp. 177-186)
  8. INDEX
    (pp. 187-192)
  9. Back Matter
    (pp. 193-199)