Reality Mining

Reality Mining: Using Big Data to Engineer a Better World

Nathan Eagle
Kate Greene
Copyright Date: 2014
Published by: MIT Press
Pages: 208
https://www.jstor.org/stable/j.ctt9qf8q3
    Book Description:

    Big Data is made up of lots of little data: numbers entered into cell phones, addresses entered into GPS devices, visits to websites, online purchases, ATM transactions, and any other activity that leaves a digital trail. Although the abuse of Big Data -- surveillance, spying, hacking -- has made headlines, it shouldn't overshadow the abundant positive applications of Big Data. In Reality Mining, Nathan Eagle and Kate Greene cut through the hype and the headlines to explore the positive potential of Big Data, showing the ways in which the analysis of Big Data ("Reality Mining") can be used to improve human systems as varied as political polling and disease tracking, while considering user privacy. Eagle, a recognized expert in the field, and Greene, an experienced technology journalist, describe Reality Mining at five different levels: the individual, the neighborhood and organization, the city, the nation, and the world. For each level, they first offer a nontechnical explanation of data collection methods and then describe applications and systems that have been or could be built. These include a mobile app that helps smokers quit smoking; a workplace "knowledge system"; the use of GPS, Wi-Fi, and mobile phone data to manage and predict traffic flows; and the analysis of social media to track the spread of disease. Eagle and Greene argue that Big Data, used respectfully and responsibly, can help people live better, healthier, and happier lives.

    eISBN: 978-0-262-32456-4
    Subjects: Technology

Table of Contents

  1. Front Matter
    (pp. i-iv)
  2. Table of Contents
    (pp. v-vi)
  3. Introduction
    (pp. 1-6)

    Big Data is all the rage. Conferences, books, research papers, and entrepreneurial interest on the topic abound. And with good reason: the idea of mining meaning from a previously unfathomable amount of data to identify clear trends and even to predict the future is certainly enchanting. But as all the conferences, books, research papers, and business plans illustrate, figuring out how to grapple with data on this scale and make good use of them is no simple task.

    As we define it, Big Data is the collected bits of information produced from interactions that people and objects have with the...

  4. I The Individual (One Person)
    • 1 Mobile Phones, Sensors, and Lifelogging: Collecting Data from Individuals While Considering Privacy
      (pp. 9-30)

      Never before has it been easier to collect so much daily data about ourselves. Technologies that track our habits, our location, our purchases, our routines, our social interactions, and our sentiments abound, from mobile phones and downloadable software to galvanic skin monitors and wearable cameras. Indeed, the ease with which the “data exhaust” is emitted and can be captured in the wake of our daily behaviors presents researchers with new opportunities not only to gain insight into those behaviors, but also to use these insights to better design systems to reflect how people actually behave.

      Sensors, software, and their prevalence...

    • 2 Using Personal Data in a Privacy-Sensitive Way to Make a Person’s Life Easier and Healthier
      (pp. 31-50)

      Once a person’s various streams of data exhaust are collected, the question still remains: What can be done with them? This chapter provides a variety of answers to that question, including descriptions of specific projects with goals to make systems that help people lead healthier and more enjoyable lives. The projects discussed in this chapter are generally still in their early stages and just scratching the surface of data mining at the individual level.

      One exciting application is to use an individual’s data analysis to develop services that act as a personal coach, giving prompts to help a person change...

  5. II The Neighborhood and the Organization (10 to 1,000 People)
    • 3 Gathering Data from Small Heterogeneous Groups
      (pp. 53-68)

      As chapter 1 illustrated, there is an abundance of inexpensive commercial ways to log an individual person’s data. The methods are relatively straightforward, and a person can in some ways control how her data are being used and how that use benefits her. But the complexity of collecting data and of providing appropriate incentives for collection increase when personal data are gathered within small groups of people, even if those people share a common allegiance or goal.

      Whereas collecting personal data for personal analytics feels contained and private, collecting personal data to share even with a small group of people...

    • 4 Engineering and Policy: Building More Efficient Businesses, Enabling Hyperlocal Politics, Life Queries, and Opportunity Searches
      (pp. 69-82)

      Just as chapter 2 outlines the various ways a single person’s data can be harnessed to build systems and tools to help her achieve goals or live a healthier life, this chapter shows how small-group data can be used to improve individual productivity and health, facilitate more useful group interactions, and build healthier, more livable communities. With small-group data, engineers have the potential to easily identify meaningful and causal relationships between people, events, and their environment that can help workplace managers, citizen activists, health-care providers, and local government officials make better decisions.

      At this scale, social network alliances and hierarchies...

  6. III The City (1,000 to 1,000,000 People)
    • 5 Traffic Data, Crime Stats, and Closed-Circuit Cameras: Accumulating Urban Analytics
      (pp. 85-98)

      As of 2009, more than half of the world’s 7 billion people live in cities. In the United States, at least 82 percent of people are urbanites, whereas in the more rural India only 30 percent call a city home.¹ No matter the country, all cities generate a wealth of data that ultimately reflect the various behavior patterns of its diverse inhabitants. This chapter focuses on cities, defined here as 1,000 to about a million people and on two types of data that can in many ways define the rhythm of a city: traffic metrics and crime statistics.

      Specifically, we...

    • 6 Engineering and Policy: Optimizing Resource Allocation
      (pp. 99-108)

      As technologies enable more extensive analysis of data from the individual scale to the city scale, some people turn to large data sets as a fortune-teller might turn to a crystal ball. Ask the right question, and see the future materialize before your eyes. Even with Big Data and effective analysis, however, the picture of the future can still be murky. Still, the city scale offers an exciting opportunity: build fast-adapting systems for crime and traffic, and your predictions in these areas can be useful.

      By analyzing trends in crime data, police departments in Philadelphia, Memphis, and Los Angeles have...

  7. IV The Nation (1 Million to 100 Million People)
    • 7 Taking the Pulse of a Nation: Census, Mobile Phones, and Internet Giants
      (pp. 111-124)

      As Reality Mining scales up, national governments, large companies, and international organizations begin to play a crucial role in the collection, compilation, and dissemination of data. At this national scale, researchers and entrepreneurs can gain access to a wide range of data sources, including national censuses; call records; major Internet companies such as Google, Facebook, and Twitter; and, to a limited extent, banks. Of course, some of these data are more readily available than others.

      Census data are by far the easiest to acquire. Many nations make their census findings public via websites from which data can be downloaded and...

    • 8 Engineering and Policy: Addressing National Sentiment, Economic Deficits, and Disasters
      (pp. 125-140)

      National repositories of data are crucial for understanding how to allocate resources and design policy. But finding the best way to make sense of those data has always been the crucial question. Census, data.gov, and World Bank data all provide important but often static insights into nation-scale populations. These sources provide a snapshot in time and place, constrained by the logistics of traditional data-collection techniques. Through data visualization, such as timelines, maps, and graphs, policymakers and nongovernmental organizations can see the different ways these massive, dynamic data can be sliced, diced, and cross-correlated. The results can lead to more...

  8. V Reality Mining the World’s Data (100 Million to 7 Billion People)
    • 9 Gathering the World’s Data: Global Census, International Travel and Commerce, and Planetary-Scale Communication
      (pp. 143-152)

      One of the most profound applications of Reality Mining is tracking and predicting disease and epidemics. In our globally connected world, deadly diseases can spread at catastrophic speed to unprecedented numbers of people. This last part explores data at the global scale, with an eye toward the goal of better understanding how diseases propagate throughout our massively connected world. Thanks to data sets that tell us how people move, what they search for online, and how they feel, we have the opportunity to create an information-based concept of how the world works.

      Just as nations of the world coordinate population...

    • 10 Engineering a Safer and Healthier World
      (pp. 153-164)

      Today’s world has become wrapped in data, from flight networks to call data records and Web searches to Facebook status updates. At the global scale, there may be no more worthy application of Big Data than developing a systematic way to improve health worldwide. This final chapter focuses solely on approaches for using global data to identify and stop the spread of infectious diseases, ranging from influenza to malaria. We look at the ways data described in the previous chapter can inform models of particular disease spread worldwide.

      Disease travels via people, insects, and other vectors. In order to understand...

  9. Conclusion
    (pp. 165-168)

    In writing this book, we decided to take an approach that focused, when possible, on the start-ups or established companies that were practicing some sort of Reality Mining. Although academic papers and dissertations are interesting, they often describe fleeting research projects that may or may not lead to far-reaching initiatives or that have one-time results that can’t be verified. It’s certainly true that companies can flame out, be acquired, or simply fade away, but we believe that sharing examples of Reality Mining in the marketplace is a better way to ground Big-Data applications in a practical realm.

    That said, upon...

  10. Notes
    (pp. 169-190)
  11. Index
    (pp. 191-200)