Review: “Big Data, Global Development, and Complex Social Systems”

Or how mobile and location-based data can help with socioeconomic development and lower the cost of obtaining research data.

On November 23, 2010, Nathan Eagle, CEO of TxtEagle, presented a lecture at the IBM Social Science Research Center titled “Big Data, Global Development, and Complex Social Systems”. The topics were discussed in the context of both mobile and location-based data. TxtEagle’s business model is built around using mobile phones to enable people in developing countries to earn money by performing a variety of job functions, including market research (Synovate, Google, Infogroup, and Kantar are among the companies utilizing TxtEagle services). An additional market-research-related finding was that demographic information derived from call records was found to be very similar to that obtained from customer surveys. A similar presentation by Nathan may be found here. The talk was a broad survey (versus in-depth coverage of a single research project) of several inter-related topics:

(1) mobile and location-based data
(2) analysis of large data sets
(3) business model that uses mobile phones to bring work to developing countries.

The main themes covered were:

I. COMPLEX SOCIAL SYSTEMS: The complexity of ‘Big Data’ involves “continuous, weighted, large graphs, dynamics covariates, and outcomes”. That is, traditional social science, which historically has relied on multivariate regression analysis, is insufficient for studying complex phenomena. Nathan referred to a new field that encompasses these dimensions, called Computational Social Science.


Nathan’s research questions have focused on social issues and global development. Additionally, his work aims to move beyond descriptive statistics towards Engineering Social Systems, in which prescriptive measures are taken to bring about social change. Issues covered included:

  • Healthcare:  Swine Flu, Malaria, Smoking
  • Crime:  Philadelphia crime patterns
  • Economic Development: understanding differences, enabling change


The presentation was ordered by increasing size of data set, from N=150 to N=1,000,000,000 (whereas most market research surveys are in the range of a few hundred to a couple of thousand). Nathan’s case studies were primarily empirical in nature, trying to identify cause and effect. They were a mix of ongoing and completed research projects, so results ranged from to-be-determined and inconclusive to those with strong correlations. Some of the cases presented were:

  • N=150:  a study of smokers that tracks their location via mobile phones to understand whether particular social events affect the frequency of smoking (“Is there a behavioral signature associated with relapse?”)
  • N=100,000:  Philadelphia crime analysis – one hope was to see if there were ‘crime waves’; however none were seen.
  • N=10,000,000:  Spread of Malaria and movement of people.
  • N=100,000,000:  Cell phone data in the UK was aggregated and anonymized to study social and spatial diversity

Overall, a very interesting presentation with several thought-provoking ideas on utilizing mobile and location-based data to gain better insight into social phenomena. Foremost, it is great to see folks working on these societal issues. Very noble. The epidemiology of disease is interesting in that the data would seem to lag behind the spread (especially if there is a 20-day incubation period), though perhaps a pattern will emerge from several outbreaks. It is amazing that, as in the case of Swine Flu, researchers are looking through a vast data set for a pattern based on a relatively small group of initially affected people.

On a less catastrophic note, the cell phone anonymization above makes me wonder how much of our cell phone data is being resold. It would seem traffic apps could be obtaining their data from cell phone GPS locations over time. Is cell phone data being sold to local merchants? Urban planners? Real estate developers?

From a market research perspective, Computational Social Science is interesting in that it would seem applicable in many ways. In Customer Experience, for example, many phenomena might be better understood in terms of the diffusion of ideas, or buzz, which raises the question of how a company might intervene so that its brand or product achieves a higher adoption rate.

Massachusetts could use a Jumpstart in the Internet Economy

A new research note extends a recent “IT Industry” report by the UMass Donahue Institute

Early this year, Governor Patrick and local industry professionals formed an IT Collaborative with the goal of improving the local IT sector. On December 7th the Governor presided over a press conference on the key findings from an IT industry report. The government’s focus was primarily on job creation, for which the report presented IT employment trends at the aggregate industry level. The study showed Massachusetts jobs in software to be slowly increasing, while hardware and network equipment jobs are on the decrease (a nationwide trend). The study can be found here.

The report has generated many questions, and for me a particular focal point is “How does the Massachusetts IT Industry compare to California?” The report provides some high-level data (see Chapter 5, Appendix D). However, I was interested in learning more, and so pulled data directly from the Bureau of Labor Statistics. This research extension was made possible by the well-structured approach detailed by the UMass Donahue Institute (in particular Appendix B).

The new results are not surprising in that they show California to be the leader in Internet Publishing, Broadcasting, and Web Search Portals (NAICS 519130). The shock is that Massachusetts is flat-lining in this high-growth industry sector (for category stats comparing MA and CA please also see PDF link IT-Industry).

Following the industry panel, an online audience participant remarked that Massachusetts’ relationship with California is similar to a Red Sox fan’s obsession with the Yankees! At the moment, in the Internet Content and Search space, Massachusetts is like the Red Sox of the 1990s: some successes, though always looking up while the Yankees won the World Series. This data only shines more light on the gap. More answers are needed as to how things became this way, and whether Massachusetts can jump on board anytime soon!