Considerations for Participant Protections When Conducting Internet Research
If an activity falls under the category of human subjects research, it is regulated by the federal government and the Teachers College (TC) Institutional Review Board (IRB). TC IRB has provided a guide to help researchers determine whether their activities qualify as human subjects research.
Internet research is the practice of using information from the Internet, especially freely available information on the World Wide Web or from Internet-based resources (e.g., discussion forums, social media), as research data. This guide covers considerations for participant protections when conducting Internet research, including:
- Private versus public spaces for exempt research
- Identifiable data available in public databases
- Minimizing risks when using sensitive Internet data
- Common Internet research approaches
The following information is drawn from an NIH videocast: Odwazny, L. (2014, May 8). Conducting Internet Research: Challenges and Strategies for IRBs [Video]. VideoCast NIH. https://videocast.nih.gov/summary.asp?Live=13932&bhcp=1
Private Versus Public Spaces for Exempt Research
Federal regulations define a category of human subjects research that is exempt from IRB review as:
“Research that only includes interactions involving educational tests (cognitive, diagnostic, aptitude, achievement), survey procedures, interview procedures, or observation of public behavior (including visual or auditory recording).”
With regard to online information, if the data is publicly available (such as Census data or labor statistics), it is usually not considered human subjects research. However, if the data includes identifiable information, meaning the data can be linked back to a specific individual, then it may need to undergo IRB review. Additionally, de-identified data pulled from a private source, such as data provided by a company, may also be considered human subjects research.
Public behavior is any behavior that a subject would or could perform in public without special devices or interventions. Public behavior on the Internet, however, is more difficult to pinpoint. Federal regulations indicate that an environment may be private if a reasonable user would consider their interactions in that environment to be private. To help identify public behavior on the Internet, consider:
TC IRB will determine whether an Internet environment is private or public based on the IRB protocol submission.
Identifiable Data in Public Datasets
Identifiable data is any information or record about a research participant that allows others to identify that person. Names, Social Security numbers, and bank account numbers are considered personal identifiers and are among the identifiers addressed by the Health Insurance Portability and Accountability Act of 1996 (HIPAA). TC IRB has published a blog post, Understanding Identifiable Data, that further explains the different types of identifiers. Data that includes personal identifiers does not fall under the Exempt category.
Other types of participant information may include indirect identifiers, such as birthdate, age, ethnicity, and gender. Taken alone, these pieces of information are not enough to identify any single participant. However, researchers have shown that certain combinations of these identifiers may identify participants. For example, Sweeney (2000) demonstrated that 87% of the United States population could be uniquely identified based solely on their ZIP code, gender, and date of birth.
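To illustrate how indirect identifiers combine, the following is a minimal sketch in Python, not a TC IRB tool or required procedure. It computes the share of records in a dataset whose combination of indirect identifiers appears only once. The function name `uniqueness_rate`, the field names, and the sample records are all hypothetical and made up for illustration.

```python
from collections import Counter


def uniqueness_rate(records, quasi_identifiers):
    """Return the fraction of records whose combination of the given
    indirect identifiers appears exactly once in the dataset."""
    if not records:
        return 0.0
    # Build one key per record from the selected fields.
    keys = [tuple(record[field] for field in quasi_identifiers) for record in records]
    counts = Counter(keys)
    unique = sum(1 for key in keys if counts[key] == 1)
    return unique / len(records)


# Hypothetical, fabricated records used only to demonstrate the idea.
records = [
    {"zip": "10027", "gender": "F", "dob": "1990-03-14"},
    {"zip": "10027", "gender": "F", "dob": "1990-03-14"},
    {"zip": "10027", "gender": "M", "dob": "1985-07-02"},
    {"zip": "10025", "gender": "F", "dob": "1990-03-14"},
]

# A single field singles out few records; the combination singles out more.
print(uniqueness_rate(records, ["gender"]))
print(uniqueness_rate(records, ["zip", "gender", "dob"]))
```

In this toy dataset, gender alone isolates one of four records, while the combination of ZIP code, gender, and date of birth isolates two of four. A high rate on a real dataset would suggest that the data may be re-identifiable even without direct identifiers, which is worth noting in an IRB protocol submission.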
It is important to remember that while data may be publicly available, it may still contain identifiable information. In these cases, the IRB will decide the risk to participants on a case-by-case basis. With Internet information, consider these to be possible identifiers:
Minimizing Risk When Using Sensitive Internet Data
In cases where sensitive Internet data must be used for research purposes, researchers should take precautions to ensure the safety and privacy of participants. The nature of online research increases risk to participants in some areas. Researchers should develop a plan to minimize risk in the following areas:
- Reduced Participant Contact: when research is conducted over the Internet, researchers have limited or no direct contact with subjects. This makes it more difficult for researchers to gauge subjects' reactions to the study interventions.
  - Researchers should think through multiple possibilities for interventions, debriefing, and follow-up, if applicable.
  - Researcher and TC IRB contact information should be presented on the informed consent form before the study begins. This will ensure that participants know whom to contact if they have questions or concerns.
- Breach of Confidentiality: when data is collected or stored on devices connected to the Internet, there is a heightened risk that identifiable participant data will be leaked.
  - TC IRB has published a Data Security Plan outlining best practices for securing and transmitting data. Researchers should implement these practices as they apply to their specific study.
  - In the case of a breach of confidentiality, researchers must file an adverse event report with TC IRB.
Common Internet Research Approaches
The Secretary’s Advisory Committee on Human Research Protections (SACHRP) has provided examples of common Internet research practices. These include elements of research conducted over the Internet. Below are possible examples of Internet research where human subjects may be involved:
- Data Mining or Data Scraping: using Internet information that is readily available to the public. This type of information gathering typically does not involve direct interactions with participants.
  - Existing datasets (secondary data analysis)
  - Social media/blog posts
  - Chat room interactions
- Online Subject Recruitment: using the Internet as a space for recruiting or interacting with participants.
  - Qualtrics
  - Amazon Mechanical Turk
  - REDCap
  - Social media
- Research on the Internet: directly studying the Internet and its effects.
  - Patterns on social media or websites
  - Evolution of privacy issues
  - Spread of false information
- Research on Internet Users: directly studying Internet users and their behaviors.
  - Online shopping patterns and personalized digital marketing
  - Online interventions such as “nudging”
Increased Internet use for research requires researchers and IRBs to become familiar with Internet research-related topics and concerns. Research submitted to the IRB will be reviewed on a case-by-case basis. The Institutional Review Board at Teachers College will make the final determination of whether a study requires review. Researchers should email IRB@tc.edu if they have any questions or concerns about their study design or about whether it requires IRB review.