Researchers working with human subjects will often hear phrases such as “remove all identifiable data” or “protect identifiable data with reliable security measures.” Identifiable data is vulnerable because it includes information or records about a research participant that allow others to identify that person. If unauthorized individuals gain access to identifiable data, confidentiality and privacy agreements could be breached. The Health Insurance Portability and Accountability Act of 1996 (HIPAA) protects 18 types of personal identifiers. For most human subjects research at Teachers College (TC), personal identifiers include the following:

  • Names
  • Social security numbers
  • Bank account information
  • Fingerprints
  • Telephone numbers
  • Home or email addresses
  • Medical record numbers
  • Codes that link de-identified data to identifiers (when the codes are not stored separately from the data)

Audio or video recordings of participants are also considered forms of personal identifiers and should be protected as such.

Data may also be considered identifiable if it combines enough information to potentially identify a participant. Indirect identifiers come into play when a researcher does not collect personal identifiers, such as names, but collects enough other information that someone familiar with the participant’s background could identify them. Indirect identifiers include:

  • Age
  • Ethnicity
  • Gender
  • City or state of residence
  • Occupation or role
  • Job function or title
  • Specific time, event, context, or occasion

While any one of these identifiers alone may not be enough to single out a participant in a study, a combination of them might make a participant identifiable. For example:

  • Demographic information and immigration status of ethnic minorities in a rural county
  • A study on workplace performance among individuals with depression recruited from a small organization
  • Graduates’ perspectives from a small high school coupled with their occupation
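
The risk that a combination of indirect identifiers singles someone out can be checked roughly by counting how many records share each combination. The sketch below is only an illustration in Python: the field names, the sample records, and the threshold `k` are hypothetical, and the group-size count (similar in spirit to a k-anonymity check) is not a TC IRB requirement or standard.

```python
from collections import Counter


def small_groups(records, indirect_identifiers, k=5):
    """Return each combination of indirect identifiers shared by fewer than k records."""
    counts = Counter(
        tuple(record[field] for field in indirect_identifiers)
        for record in records
    )
    return {combo: n for combo, n in counts.items() if n < k}


# Hypothetical sample data; real field names will differ by study.
participants = [
    {"age": 34, "county": "A", "occupation": "teacher"},
    {"age": 34, "county": "A", "occupation": "teacher"},
    {"age": 52, "county": "B", "occupation": "principal"},
]

# With k=2, any combination that appears only once is flagged.
risky = small_groups(participants, ["age", "county", "occupation"], k=2)
print(risky)  # {(52, 'B', 'principal'): 1}
```

A flagged combination does not prove that a participant can be identified, and an unflagged one does not prove safety. It is only a prompt to consider collecting fewer indirect identifiers or reporting them in broader categories.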

Data collection sources like Amazon’s MTurk are not completely anonymous as the workers’ IDs are linked and stored by Amazon.com. Researchers should clarify that they will not link any identifying information, including the workers’ IDs, to the data they obtain from MTurk. Please visit Amazon’s MTurk Privacy Policy for more information.
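
One way to avoid linking worker IDs to study data is to drop the ID column from an exported results file before analysis. The following is a minimal Python sketch, assuming the export is a CSV file; the column names (`WorkerId`, `Answer.q1`, `Answer.q2`) and the sample rows are assumptions for illustration, so check the headers of the actual file.

```python
import csv
import io


def drop_columns(csv_text, columns_to_drop):
    """Return the CSV text with the named columns removed."""
    reader = csv.DictReader(io.StringIO(csv_text))
    kept = [name for name in reader.fieldnames if name not in columns_to_drop]
    out = io.StringIO()
    # extrasaction="ignore" silently skips the dropped columns in each row.
    writer = csv.DictWriter(
        out, fieldnames=kept, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()


# Hypothetical export with an assumed "WorkerId" column.
raw = "WorkerId,Answer.q1,Answer.q2\nA1EXAMPLE,4,yes\nA2EXAMPLE,2,no\n"
cleaned = drop_columns(raw, {"WorkerId"})
print(cleaned)
```

Removing the column only helps if the original export is not kept alongside the cleaned file.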

Data containing direct or indirect identifiers is subject to the following privacy and security measures:

  • Store datasets on TC-approved systems, such as TC Google Drive
  • Transmit data with identifiers over the TC-provided Virtual Private Network (VPN)
  • Configure systems with approved antivirus software provided by TC Information Technologies (TC IT)
  • Encrypt datasets containing identifiers

Researchers should also consult TC Information Technologies (TC IT) on ways to collect, store, transmit, and secure data with identifiers. Please review our Data Sharing, Requests, & Encryption guide for more information.

Any data that does not include identifiers (personal identifiers, indirect identifiers, audio recordings, video recordings) is considered anonymous. One way to gauge whether data has no identifiers is to ask whether the researcher could determine the source of the data, either through knowledge or inference. If the collected data comes from an individual whom the investigator cannot identify, even if pressed, the data collection can be considered anonymous.

There are instances when data will have identifiers but can be stripped of them and then stored, secured, maintained, and analyzed as anonymized data. For example, a researcher can transcribe and code video recordings, securely removing all participant information, and then destroy the video, leaving only the de-identified transcript as the record of the data. A researcher may also receive identifiable data through a data sharing agreement (e.g., Data Sharing Form Template) and de-identify it upon receiving the transmission.

In these cases, the IRB needs to know how the researcher will receive identifiable data and how they plan to de-identify it.
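
For structured records, the de-identification step can be sketched as replacing direct identifiers with random codes and holding the linking key apart from the data. The Python example below is an illustration under stated assumptions, not a TC-prescribed procedure: the field names, the code format, and the sample records are hypothetical.

```python
import secrets

# Hypothetical field names; adjust to the identifiers the study collects.
DIRECT_IDENTIFIERS = ("name", "email")


def deidentify(records, identifier_fields=DIRECT_IDENTIFIERS):
    """Split records into de-identified rows and a separate linking key."""
    deidentified = []
    linking_key = {}
    for record in records:
        # Random code, so it cannot be derived from the participant's details.
        code = "P-" + secrets.token_hex(4)
        while code in linking_key:  # regenerate on the rare collision
            code = "P-" + secrets.token_hex(4)
        linking_key[code] = {
            field: record[field] for field in identifier_fields if field in record
        }
        stripped = {
            field: value
            for field, value in record.items()
            if field not in identifier_fields
        }
        stripped["participant_code"] = code
        deidentified.append(stripped)
    return deidentified, linking_key


# Hypothetical sample records.
records = [
    {"name": "Example Person", "email": "person@example.org", "response": "agree"},
    {"name": "Another Person", "email": "another@example.org", "response": "disagree"},
]
data, key = deidentify(records)
```

As noted in the list of personal identifiers above, a code that links de-identified data to identifiers is itself an identifier unless it is stored separately from the data. If the data is meant to be anonymized rather than coded, the linking key would need to be destroyed. This sketch handles only named fields; free-text responses and transcripts can still contain names or other identifying details and need separate review.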

If a secondary research study involving human subjects does not qualify for an exemption (review our Submitting a Protocol for Existing Data for more information), the study must comply with the criteria for IRB approval of research at 45 CFR 46.111, which include the requirement to seek informed consent from every prospective subject or legally authorized representative, unless informed consent is waived by the IRB. Under the revised Common Rule, there are two options for conducting a secondary research study that involves human subjects but does not qualify for an exemption:

  • Apply for and obtain a waiver of the requirement for informed consent from the IRB
  • Seek and obtain the study-specific informed consent of each potential subject or legally authorized representative for the study in question

TC IRB will review each data type and source on a case-by-case basis and determine the review category that best fits the data collected.