The moderator, Leonard Glantz, JD, of Boston University School of Medicine, introduced Part II by noting that the research community is still working to determine whether and how medical data differ from other kinds of research data, and that there are differing views about what "privacy" refers to. It is important to be clear about what we mean by privacy when conceptualizing someone's interests in his or her data.
Alan Rubel, MA, JD, PhD, from the University of Wisconsin-Madison, outlined different ways to conceptualize the values that underlie privacy rights, including consequentialist conceptions (i.e., balancing the aggregate good that can come from data sharing against individual harms from potential privacy loss) and autonomy, or eudaemonist, conceptions (i.e., people have some interest in, and ability to, decide what matters to them and to act according to those decisions). He raised questions about whether it is possible to act autonomously in the realm of data sharing, given people's limited understanding of how their information is shared and used. He also discussed the fairness of distribution, noting that, especially with respect to commercial sharing of data, the people who benefit are often not the same as the people who contributed their data.
Suzanne M. Rivera, PhD, from Case Western Reserve University, argued that "research exceptionalism" is problematic and unjustified: individuals increasingly share their private (and even health) information on social media and with commercial entities not directly tied to research; IRBs already evaluate the use of research data for potential risks to human subjects; "respect for persons is not the only Belmont principle"; and subjects can be protected if researchers are held accountable for unauthorized behavior. She suggested we need to shift our thinking from a focus on individual rights to a focus on the common good.
Celia Fisher, PhD, from Fordham University, focused on concerns related to socially marginalized populations and group harms. Typically, decisions about risk and benefit have been made by IRBs, which share an assumption that scientific progress is a social good, but that perspective is not always shared by those with less power and influence, who may be most affected by future health policies informed by big data research. Using examples, she questioned whether the evaluation of research risks and benefits sufficiently takes into account how scientific data can be used to sustain social stigmatization and discriminatory health policies, especially when much of the research is funded by private sector and government entities whose interests differ from those of minority groups, and when the subjects involved and local IRBs exert less influence over the downstream uses of the data. She called for innovative approaches that foster public transparency and stakeholder participation in the risk-benefit analysis of how big data are used.
Michael Zimmer, PhD, from the University of Wisconsin-Milwaukee, presented three "provocations" to the group. First, he suggested that when we examine the use of large data sets for medical research through the lens of information ethics, we are better able to understand and potentially address widely held views about consent, anonymity, and privacy that seem irrational or inexplicable when we look through the lens of regulatory compliance. Second, he proposed that a conception of privacy as "contextual integrity" provides a useful way to analyze individuals' rights and expectations regarding who will have access to their information (e.g., people don't expect information they share on Facebook to become part of a research dataset, though they do expect it to be available to their friends). Third, he urged the group to recognize that data are not homogeneous, and that there are considerable complexities in thinking about the initial and secondary uses of different types of data.
Ifeoma Ajunwa, JD, PhD, from the Berkman Klein Center at Harvard University, proposed that we ask whether the term "human subject" should include not just someone who participates in a clinical trial, but also "customers" of a service such as 23andMe, which has removed genetic testing from the clinical setting and made it a direct-to-consumer service. Both 23andMe and workplace wellness programs are selling their customers' and employees' genetic information for research purposes, and she questioned whether customers and employees understand that their data are being sold to undisclosed parties who may not always be acting for altruistic research purposes. She asked what protections are available to subjects in these new research spheres, and whether private companies should be allowed to determine who is a subject, what data may be shared and used, and how.
Mark Barnes, JD, from the Multi-Regional Clinical Trials Center of Brigham and Women's Hospital and Harvard University (MRCT Center), spoke about the new reality of data use, in which commercial firms are increasingly trying to buy both de-identified and identifiable forms of information, including a person's financial data, consumer data, biospecimens, and phenotypic data, and working to capitalize on the aggregation of these different types of data. He also noted a surprising provision in the new Common Rule, namely, that if someone revokes his/her broad consent for the future use of his/her identifiable data or biospecimens, institutions are permitted to de-identify those data or biospecimens and keep using them for research. This reflects a clear "value choice." He concluded by suggesting that this new era of data sharing should be accompanied by strong enforcement mechanisms targeted at parties who re-identify data or use data in ways that were not authorized.
Elizabeth Buchanan, PhD, from the University of Wisconsin-Stout, asked what values and principles underlie our decisions about how to share and use data. She argued that we need a human rights framework for thinking about data sharing. When we look at who benefits and who stands to lose from data sharing efforts, the playing field is not fair. The issue of data sharing is often framed as one of individual rights versus societal benefit. If we come down on the side of societal benefit, we might ask whether individuals should have the right to opt out of pervasive data collection. But this question is only reasonable against a background of shared values and principles around data sharing, which we do not currently have. There are different types of data and different controllers of data. For instance, commercial entities like Facebook do not always make it known that they will sell or share data, or for what (commercial) purpose. Entities such as PatientsLikeMe, in contrast, are transparent about the fact that data will be shared, and are interested in data sharing for altruistic reasons. She also stressed the need to be concerned about how we are educating the next generation of data scientists about the values and ethos of big data and data sharing.
Professor Glantz asked the panelists if we could mitigate the potential harms that arise from data sharing by keeping private information out of the hands of bad actors and those who seek to share data only for financial gain, and whether there is any harm in collecting de-identified private information and providing it to "good" actors who just want to use the data for altruistic purposes.
The following themes emerged in the course of discussion: the need to re-think and re-define harm and privacy violations in an era of data sharing; concerns about how economic benefits are distributed under emerging data sharing models; the importance of reviewing who decides what "public interest" means when data are shared; and the need to ensure that people understand the ways in which they are relinquishing traditional rights to privacy.
Dr. Rubel suggested that even when participants are not harmed by new data sharing and access initiatives and stand to benefit in general from scientific advances, they may not always see the same financial benefits as the organizations that use their data. This raises concerns about the fair distribution of benefits. Workshop participants discussed how the Henrietta Lacks case brought questions about the fair distribution of economic benefits to the public's attention, and whether people would have the same concerns had her cells remained de-identified.
Dr. Zimmer urged the workshop participants to go beyond traditional notions of harm, such as economic consequences, and look at the implications of data sharing for autonomy and for an individual's ability to decide how his or her data are used, even after de-identification measures have been put in place.
Dr. Ajunwa agreed that loss of autonomy is a relevant harm here, and raised the point that an individual's ability to control his or her health data is directly tied to his or her personhood. She noted that individuals are increasingly asked to give up their privacy rights in the name of "progress" or "innovation," when it is really in the service of surveillance; we should think beyond economic harms and think instead in terms of people's dignitary rights. Dr. Rivera argued that we should compare the risks and benefits of increased data access and sharing, and that in some situations, there may be only a small risk of theoretical dignitary harm and a great scientific reward. However, she also pointed out that trust in the scientific enterprise is important, and that there should be penalties for bad actors, especially individual investigators who misbehave. Mr. Barnes suggested that IRBs need a better conceptual framework to identify both economic and non-economic harms that "discrete and insular minorities" might face, such as religious and dignitary harms.
Mr. Barnes also floated the "free rider problem": if the majority of the public chooses to opt in to a system in which their data are shared, the minority who opt out benefit from the resulting advances in science while facing no potential dignitary harms themselves. Dr. Rubel pointed out that even though individuals might choose not to participate in data sharing or research, they may support the scientific enterprise in other ways, including with their tax dollars.
Dr. Fisher suggested that we should not assume that even publicly funded research is always in the interest of the public good. First, she questioned who this "public" is. Second, she argued that we need more transparency on the part of funders, including the government, about the purposes of the research they are supporting. Using personal biomedical data for future research on social issues (for example, the NIH Violence Initiative, which looked at potential biological causes of crime in urban environments) has the potential to lead to group harms such as stigmatization or problems getting insurance. She called for more oversight with respect to future harms associated with uses of our data that we cannot anticipate now. Such concerns might be mitigated by public transparency and soliciting stakeholder input, as both measures influence what research the government chooses to fund.
Sharon Shriver, PhD, from PRIM&R, raised the concern that much of the discussion during the workshop assumes that research data can be fully de-identified. However, our ability to keep data (especially genetic data) de-identified might be undermined by future technological developments. Workshop participants discussed how the possibility of criminal penalties for re-identification of de-identified material might deter such practices in the future. Other participants raised the problem that the more data are de-identified, the less useful they are for research purposes.
Professor Glantz asked the workshop participants to reflect on the evolving expectations of privacy, and whether there is still such a thing as a "reasonable expectation of privacy," given our new data access and sharing landscape.
Dr. Rivera and others pointed out that conceptions of privacy change over time with new technology, and that our current system of research protections does not acknowledge the way people increasingly live their lives online and on social media, where they are willing to accept reduced privacy in exchange for better online interactions.
Ms. Odwazny argued that the Common Rule's regulatory scheme does not encourage this kind of research exceptionalism because "the regulatory definition of minimal risk does allow for daily life risk standards." Furthermore, IRBs are often presented with big data research proposals that ought not to be regulated as human subjects research because the data being used are not private and identifiable. One possible solution, therefore, is IRB education of the type that PRIM&R provides, to help IRBs better understand when big data research is human subjects research, and when it poses more than minimal risk.
Dr. Ajunwa took issue with the idea of people "willingly trading their privacy," which had come up earlier, because it suggests a deeper awareness of how privacy is being invaded than many people have. For example, some employers tout their workplace wellness programs as benefits to employees, but don't acknowledge that the programs are a means of acquiring employees' health data for undisclosed third-party commercial uses.
Dr. Zimmer raised the point that social media giants like Facebook socialize people to share more information online and to revise their privacy expectations. However, just because the public is increasingly likely to share their information via social media platforms does not necessarily mean their expectations of privacy have changed; we should look at the context within which people originally chose to share their information.