Chatbots For Social Change/IRB, Research Ethics
Institutional Review Boards (IRBs) have long acted as a much-needed regulatory arm of academic research. Although IRB principles differ somewhat across jurisdictions, any one IRB provides a framework for planning ethical human-subjects research, and that framework is a useful guide for researchers inside or outside the institution. In this section we give a high-level overview of the IRB guidelines at Cornell University as they apply to experimental interventions that use Large Language Models (LLMs) with human participants.
If you are interested in publishing academic research that makes scientific claims based on the results of an intervention, most journals will require proof of approval from an institutional IRB. If you are not currently affiliated with a university or research institution, your best bet is to find a collaborator or co-author who is willing to act as the point of contact and submit the protocol to their IRB under their name.
Outline of IRB Considerations
Informed Consent: Participants must be fully informed about the nature of their interaction with the LLM, including potential risks, benefits, and the overall purpose of the research. The process must clearly distinguish the human participant's interaction with the AI from more traditional interventions. Special attention must be paid to ensuring participants understand that they are interacting with a machine rather than a person, and how their data might be used, stored, or processed by the AI system. If vulnerable populations are involved, the consent process may require further scrutiny and additional safeguards.
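As an illustration only, the sketch below shows one way a chatbot front end might implement this disclosure in Python: the statement that the participant is talking to a machine is shown before any conversation begins, and an affirmative response is timestamped. Everything here (the wording, the obtain_consent helper, the stored fields) is hypothetical, not IRB-prescribed language.

```python
# Hypothetical consent gate for a chatbot study: the disclosure that the
# participant is talking to a machine, and their affirmative agreement,
# are recorded with a timestamp before any conversation data is collected.
from datetime import datetime, timezone

DISCLOSURE = (
    "You are about to interact with an AI chatbot, not a human. "
    "Your messages will be stored and analyzed for research purposes. "
    "Type 'I agree' to continue, or anything else to exit."
)

def obtain_consent() -> dict | None:
    print(DISCLOSURE)
    answer = input("> ").strip().lower()
    if answer != "i agree":
        return None  # participant declined; collect nothing
    return {
        "consented": True,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }

consent = obtain_consent()
if consent is None:
    raise SystemExit("Participant declined; session ends with no data stored.")
```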
Data Collection and Retention: Data collection should be designed with clear protocols for safeguarding participant information. At the outset, the researcher must obtain informed consent, ensuring participants understand the type of data being collected, the purpose of the study, and how their information will be used and protected. Sensitive data, including personally identifiable information (PII), should be minimized to the greatest extent possible. If collecting and storing PII is necessary, the data collection process must involve robust encryption methods, such as AES-256 encryption, both at rest and in transit. This ensures that the data is secure during storage and transfer, preventing unauthorized access or breaches. Additionally, research teams should utilize secure data management platforms, with access restricted to only those individuals directly involved in the study.
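To make the encryption requirement concrete, here is a minimal sketch of encrypting a participant record at rest with AES-256-GCM via the Python `cryptography` package. The record format and in-process key handling are illustrative assumptions; in practice the key would live in an institutional secrets store, and encryption in transit would typically be handled by TLS rather than application code.

```python
# A minimal sketch of encrypting collected records at rest with AES-256-GCM.
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def encrypt_record(key: bytes, plaintext: bytes) -> bytes:
    """Encrypt one participant record; the 12-byte nonce is prepended."""
    nonce = os.urandom(12)  # must be unique per message
    return nonce + AESGCM(key).encrypt(nonce, plaintext, None)

def decrypt_record(key: bytes, blob: bytes) -> bytes:
    """Split off the nonce, then authenticate and decrypt the record."""
    nonce, ciphertext = blob[:12], blob[12:]
    return AESGCM(key).decrypt(nonce, ciphertext, None)

key = AESGCM.generate_key(bit_length=256)  # 256-bit key -> AES-256
blob = encrypt_record(key, b"participant 17: transcript ...")
assert decrypt_record(key, blob) == b"participant 17: transcript ..."
```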
To align with Cornell IRB standards, researchers must develop a comprehensive data retention and destruction policy. Data should only be retained for as long as is necessary to meet the objectives of the research. It is recommended to clearly outline a data retention period in the IRB submission, which includes specific timelines for data anonymization and deletion. For sensitive datasets, anonymization should involve techniques such as data masking, pseudonymization, or aggregation, which effectively reduce the risk of re-identification. Once the study is complete or the data is no longer needed, researchers must ensure that all data, particularly PII, is securely destroyed using approved methods, such as cryptographic erasure or physical destruction of storage media. Furthermore, if data is to be shared with third parties, strict data-sharing agreements should be established to ensure these entities adhere to the same confidentiality standards and that the data remains protected throughout its lifecycle. By employing these strategies, researchers can adequately protect participants' privacy and meet Cornell IRB's stringent data protection requirements.
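Of the anonymization techniques mentioned above, pseudonymization is the easiest to sketch. The example below is a simple illustration rather than a vetted procedure: it replaces a raw identifier with a keyed HMAC-SHA256 digest, so records can still be linked across sessions without storing the identifier itself. Destroying the secret key later makes the mapping irrecoverable, which supports the destruction policy described above.

```python
# A minimal sketch of pseudonymization: identifiers are replaced with keyed
# HMAC-SHA256 digests. The secret key must be stored separately from the data.
import hmac
import hashlib

def pseudonymize(secret_key: bytes, identifier: str) -> str:
    """Map a raw identifier (e.g., an email address) to a stable pseudonym."""
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]  # truncated for readability

record = {
    "participant": pseudonymize(b"study-secret", "jane.doe@example.com"),
    "condition": "chatbot",
    "response": "transcript ...",
}
```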
Risk Assessment: As part of the IRB review, researchers are required to provide a thorough risk assessment, identifying any potential harms that may arise from the use of LLMs. This includes emotional distress, the possibility of biased responses from the AI system, or unintended social consequences resulting from the interaction. If the LLM is designed to have an influence on participants' decision-making, emotions, or social behavior, these risks must be carefully weighed. The IRB will also evaluate how the research team plans to monitor and mitigate such risks, including offering resources or referrals for participants who may need support after the intervention.
Impact of LLM on Decision-Making and Autonomy: Because LLMs simulate human-like conversation, there is concern about how AI might influence a participant's autonomy. Cornell's IRB expects researchers to clarify how the LLM's responses are generated and to assess whether there is a risk of the chatbot's recommendations or outputs being perceived as authoritative or manipulative. In fields where the research seeks to create social or behavioral change, the ethical implications of using LLM-generated content to influence participants must be considered. Researchers should propose clear debriefing mechanisms to ensure participants understand the nature of the interaction post-experiment.
Bias and Fairness: LLMs may reflect biases inherent in the data they were trained on, potentially leading to socially harmful outcomes. Cornell's IRB requires researchers to address how they will monitor for and mitigate bias in the LLM's responses, particularly if the intervention affects marginalized or vulnerable groups. This could involve regular auditing of the AI's outputs for fairness, as well as transparency about how the AI was trained. Any known limitations or biases of the LLM should be disclosed in the IRB application and communicated to participants.
Debriefing and Feedback: For research involving LLMs, especially where the social impact of the intervention is unclear or could have unforeseen consequences, a thorough debriefing process is necessary. The IRB will look for details about how participants will be informed about the true nature of the LLM interaction post-experiment and given the opportunity to ask questions or withdraw their data if they choose. Researchers are encouraged to include a mechanism for participants to provide feedback on their experience, which can also help in identifying any unanticipated risks or impacts.
Special Considerations for Social Impact Research: If the research aims to address societal issues or achieve a broader social impact, such as influencing public opinion, political views, or behaviors, Cornell’s IRB will evaluate whether the intervention could lead to unintended social disruptions. For example, if a chatbot is designed to engage with users on sensitive topics like mental health, political ideologies, or social justice, the IRB will require the researcher to provide detailed justifications for the choice of topic, population, and the ethical considerations of using an AI for such interventions.
The IRB Review Process
The IRB review process at Cornell University is a collaborative and iterative one, designed to ensure that research involving human subjects adheres to strict ethical standards. After researchers submit their initial proposal, which includes study objectives, methodologies, participant recruitment strategies, and data protection plans, the IRB typically engages in a back-and-forth process with the research team. This communication, often conducted over email, involves the IRB providing detailed feedback and requesting clarifications or modifications to ensure compliance with both institutional policies and federal regulations.
The feedback process can require multiple revisions, as the IRB might suggest adjustments to improve participant protections, refine the consent process, or better safeguard sensitive data. Researchers are expected to address these concerns and resubmit their revised protocols for further review. This ensures that the research is ethically sound before approval is granted.
Once a study is approved, the IRB’s oversight doesn’t stop. For ongoing or multi-year studies, researchers must submit annual renewal applications to maintain their approval status. Any significant changes to the study design, methodology, or participant involvement during the course of the research also require prior IRB approval through an amendment process.
Exempt Status
At Cornell University, certain types of research involving human subjects may qualify for an IRB Exempt category, meaning they are subject to a lighter level of review. While these studies are still required to meet ethical standards, they typically involve minimal risk to participants and are eligible for a streamlined review process.
To qualify for exemption, the research must fall into one of several federally defined categories, such as studies involving normal educational practices, anonymous surveys, or research using publicly available data. However, even if a study meets these criteria, it must still be submitted to the IRB for an official determination of exempt status.
The exemption does not mean the study is free from oversight. Researchers are still required to follow guidelines related to informed consent, data privacy, and participant welfare. Additionally, any significant changes to the research after exemption is granted must be submitted to the IRB for review to confirm that the study remains eligible for exempt status. Although exempt studies do not require annual renewals, researchers must keep the IRB informed of any updates that could affect the scope or risk level of the research.