University of Pittsburgh

September 30, 2010

Technology calls into question research ethics concepts

The ability to conduct research online is forcing a new look at long-established research ethics concepts.

In the online world, who is a human subject? What is private? How do researchers protect subjects’ identities and guard against harm?

Researchers — and the institutional review boards that oversee their protocols — need to consider these emerging issues, said Internet research ethics expert Elizabeth Buchanan, a University of Wisconsin faculty member and director of the Internet Research Ethics Digital Library, Research Center and Commons.

Buchanan offered food for thought on identifying and handling online research dilemmas in a Sept. 24 lecture here, “Conducting Research on the Internet: Emerging Ethical, Regulatory and Practical Considerations,” hosted by Pitt’s Institutional Review Board (IRB) office.

Keeping up with new online environments, tools and technology is a huge challenge. “Everything changes in such a short period of time,” said Buchanan, who was co-principal investigator on a 2005-06 project that surveyed hundreds of IRBs nationwide on their Internet research policies.

“What came out of that data was that everyone felt completely lost,” she said. “Everyone was struggling with the language, the tools, the technology.” The landscape is even more complex now. “We looked back on our survey that we did in 2005 and it looks absolutely amateur. At that point we weren’t thinking about Twitter, we weren’t thinking about cloud computing in the same way.”

The online environment has forced a radical redefinition of such basics as what is a human subject. Are avatars? Are turks? (Turks are people who, for a very small amount of money, complete a requested task in an online exchange using the Amazon Mechanical Turk marketplace.)

Ethically speaking, traditional definitions and guidelines sometimes don’t align online.

“What we need to recognize — and this is the hardest part for boards when they’re making decisions — is that the black and white, yes/no, dichotomous model of making decisions simply doesn’t fit, especially not in this web 2.0 and beyond environment.”

Harm to subjects

Some traditional ethical standards — such as seeking to do no harm to subjects — are accepted principles across disciplines. However, defining harm in the context of online research can be difficult because harm may not be as evident, Buchanan said. “We may not see the effects of our research. Harm may be downstream.”

Data may start out in one forum, but can be forwarded, reblogged, reposted or retweeted. “It changes the nature of the original context. Harm may not have been an issue in the original context, but in its subsequent uses, harm may very well come. We have to think about potential uses and potential harms and risks.”

Vulnerability

Another agreed-upon ethical principle is that the greater the vulnerability of a research subject, the greater the obligation of the researcher to protect the subject. The ubiquity of Internet data challenges assumptions about who or what is vulnerable, and when, she said.

The rise of third-party data storage using Google Docs or other types of cloud computing forces a change. “Think about our stock language on protocols: We say, ‘We will keep this data for 10 years, in a locked file cabinet, in an undisclosed location …’ and for years that’s been the language that’s been used. What happens now when we can’t necessarily say with confidence that we know where our data is or for how long it’s going to be stored?”

The need for protecting subjects also can take new forms. For instance, a researcher wanted to study interactions on the online forum Gay Bombay, but at the time homosexual acts were illegal in India.

The level of review had to change due to the risk of revealing illegal information about respondents. “The researcher had to be very careful about using pseudonyms of screen names, changing contexts of the forum, so that these things couldn’t be trackbackable,” Buchanan said.

Online data can be tracked indefinitely, she noted. “Think about the ease with which Google has been infiltrating our lives … all that data is out there. They have it and what are they going to do with it?”

Why online research?

Research integrity itself is a fundamental principle: Good methods and ethics equate to good research, Buchanan said. “Unfortunately, there’s a lot of crappy research going on.” Increasingly, online environments are used for convenience “where the justification for using that environment may not be strong enough.”

Researchers may find it expedient to use turks to get survey data quickly — paying 2 cents each to 500 turks online is a speedy way to get data, but is there a reason for using this population? she asked.

“I get nervous about research that’s being done online simply because you can do it online. Is there a justifiable reason? This comes back to research integrity.”

Researchers need to consider whether they are collecting good data by using online tools.

Verifiability of respondents can be difficult: What if one respondent answers the same survey under multiple personas? Likewise, how participants represent themselves can be questionable. “When do we know what we should believe online?” she asked. “How much posturing goes on in these online environments? What is good data from these online sites?”

Data banking

Researchers in medical disciplines are familiar with tissue banks or gene banks sharing data and materials — and IRBs have corresponding ways of requesting consent from research subjects. In other disciplines — the social sciences, for instance — “We take our field notes, we collect our data and then we keep it.”

The National Science Foundation recently announced that scientists seeking funding soon will be required to submit data management plans to foster more open data sharing. Buchanan said, “They’re going to want you to share your data. That’s part of the public good of research. … We’re not used to that model right now.”

Social science researchers will need to consider the various kinds of data that may need to be banked: textual, audio, video, data from Skype interviews — and how to obtain consent or re-consent for additional use of the data. Unlike a tissue sample, data banked online aren’t solely in the bank, Buchanan noted. “It’s also somewhere out on the web. We have to get used to thinking about shared sets of research. It’s a paradigm shift for us.”

An infrastructure will need to be in place for banked data, but Buchanan expressed concern about outsourcing of research and the potential for losing more and more control of data if entities such as AOL and Google someday charge researchers to access mass data sets. “Those are the things I hope don’t come to be.”

She urged that librarians and information architects be consulted as institutions prepare for the issues that will arise under the new NSF requirements.

Public & private forums online

Researchers may have a good understanding of what’s public and what’s private in the real world, but what about online spaces?

In a public park, people have no expectation of privacy, but what about an Internet chat room or other online space? The very nature of posting on Facebook or Twitter implies that by virtue of entering such spaces, users want to be known or seen, she said.

Even owners of private Twitter accounts (who must approve those followers who wish to receive their tweets) may find their communications flow into the public realm easily. For instance, if an approved follower with a public account retweets a message from a private account, it becomes widely visible.

Users of members-only sites may have other expectations of privacy. Contacting the moderator or owner of such sites to obtain consent may be a solution, Buchanan said.

The nature of the data can influence privacy considerations, so greater care needs to be exercised in the case of sensitive information.

Data ownership

Online survey tools such as Survey Monkey raise questions about who owns the collected data, and where and how they are stored.

Buchanan said some institutions are building or customizing their own tools to maintain control of their data. She advised those who use third parties to examine closely the terms of service agreements.

If the information collected is non-sensitive, it may pose little problem, but in the case of highly sensitive data the question of where they are kept and for how long can be an issue, she said.

Hacking and data loss also are dangers that researchers need to consider. Again, Buchanan advised that information technology experts be consulted to identify potential risks.

The distance principle

Internet research complicates human subjects review, but the distance principle can help.

As the distance — be it emotional, psychological, physical or methodological — between a researcher and participant decreases, the research is more likely to be defined as involving human subjects. As distance increases, the opposite is true, Buchanan said.

For instance, in an interview conducted in the virtual world of Second Life, data are produced from an interaction in which there is little distance between researcher and participant. Although they are represented by avatars, the avatars correspond to individuals.

More and more often, avatars are being considered human subjects, Buchanan said. In contrast, using an automated bot program to collect data on web-surfing behavior, for instance, yields data collected far from the researcher and therefore is less likely to be considered human subject research.

Researchers bear the responsibility for ensuring the IRB understands the online aspects of their proposed project. Submissions could include a glossary of online environments and the type of data that could be collected from them.

A researcher also could submit screen shots of the online tools to be used, or could attend an IRB meeting to walk the committee through the venue.

IRBs also should consider some new questions in reviewing online protocols, Buchanan suggested.

Among them:

Does the researcher understand the venue or the tool?

Do the research subjects perceive their interaction as public or private?

Do subjects consider personal networks of connections to be sensitive information?

How will a subject’s profile, location or other personally identifying information be used or stored by the researcher?

If the content of a subject’s communication were to become known beyond the confines of the venue being studied, would harm likely result?

How do terms-of-service agreements articulate privacy of content? How is content shared with third parties?

How can the researcher ensure that participants understand and agree that their content or interaction may be used for research purposes?

Are the data easily searchable and retrievable?

Are the data subject to open data laws or regulations?

What third-party policies impact the research?

How long does the third-party provider or ISP preserve the data and where?

Can the researcher provide adequate information to participants concerning how the third party will protect their data?

How will researchers anonymize email content or header information to protect subjects’ privacy?

Regardless of terms of service, what are the community or individual norms and/or expectations for privacy?

For additional information, the Internet Research Ethics (IRE) Digital Library, Research Center and Commons, online at www.Internetresearchethics.org, contains literature on Internet research ethics, a blog area and IRE presentations.

—Kimberly K. Barlow

Filed under: Feature, Volume 43 Issue 3
