The evidence base of the social sciences is expanding, and the services provided by social science research centers need to change along with it. To facilitate this movement, we are forming a new consortium of these centers to share information about how to obtain, for their faculty and students, troves of new, informative social and behavioral data from corporate, governmental, and other organizational sources. The challenges in obtaining these data are often far greater than those of the data collection efforts we have tackled individually, and the potential benefits for understanding and ameliorating the problems that affect human society are much larger.
Not long ago, academics had access to almost all the research data in the world because we created it or easily obtained it from others. Today, the most informative data about people and societies is collected by private firms that offer no provisions for academic access. Social scientists have more data than ever before, which has led to spectacular progress in our field, but we also have access to a smaller fraction of the data in the world than ever before.
Although faculty and students obtain a great deal of data from academic organizations like ICPSR and Dataverse, they increasingly must fan out to the real world to negotiate with a broad array of commercial firms and others to obtain access to data. These negotiations are difficult, time-consuming, risky, complicated, and sometimes infuriating. Convincing a company to work with you is already extremely difficult, but even a researcher who clears that hurdle is not then negotiating with a corporation as a unitary actor. Instead, individual researchers or research centers receive calls from separate teams of corporate attorneys, the communications team, the product team, a program manager, a technical project lead, and often many others. For one researcher, or even one supporting research center, merely answering the phone can be overwhelming. Unfortunately, even when all that can be conquered, it is still insufficient, since the same negotiation must then begin internally, within the researcher’s own university, often involving extensive separate discussions with such bodies as Institutional Review Boards, Offices of General Counsel, Information Technology Offices, Information Security Offices, Public Affairs and Communications Offices, Offices of Sponsored Programs, and, to manage institutional risk, the Offices of the President and Provost, among others.
Moreover, each company holding private data may require a completely different model of data sharing, or a novel model that worked in a different industry, or a creative combination of two different models, used by unrelated companies, that are not publicly known. Negotiating with companies and our own universities is usually easier if we can show them that their peers have adopted similar data sharing models, and so understanding the broad landscape of what has been done and what could be done is valuable. As with data sharing in general, information sharing among research centers about making data available from industry can be highly productive.
Social Science One was originally launched to pilot a specific model of industry-academic partnerships, seeking to share Facebook data with academics as a test case. We eventually succeeded in creating, and providing academic access to, the “Facebook URLs dataset” (containing data on the effects of social media on elections and democracy, and including more than 17 trillion cell values) and streaming APIs on Facebook’s page views and political ads library. We also recently helped launch an extensive set of experiments and observational studies now underway on the 2020 US election.
However, the road has been neither smooth nor fast, and we learned a remarkable amount in the process of overcoming the myriad regulatory, institutional, university, privacy, and other hurdles involved in negotiating access to, and approval for the use of, this data. This knowledge pays considerable dividends for our other data collection efforts and for others who have come to us for advice. In fact, a second partnership we established with Microsoft, on building open-source, privacy-protective software, was far faster and easier to establish and is already bearing fruit. We have also shared information on industry-academic partnerships with a growing number of other academics who have made progress with an array of other companies. If we work together to share information on partnering with industry and others, we are confident that research centers will be able to better serve their faculty and students and to create considerable public good.