Artificial Intelligence (AI) has long ceased to be merely a data-processing tool; it is beginning to exhibit traits remarkably similar to those of humans. Research from scientists at the University of Zurich has shown that generative AI models, such as ChatGPT, are sensitive to emotional content, particularly negative content. This discovery raises important questions about how emotions affect AI performance and what can be done to enhance its stability. In this article, we will break down the key findings of the study, describe how AI responds to traumatic content, and explain how the "benign prompt injection" method helps address this issue.
How AI Responds to Negative Content
Generative AI models, such as ChatGPT, are trained on vast amounts of human-created text. Along with information, they also absorb human cognitive biases and emotional patterns. The researchers found that negative content, such as descriptions of car accidents, natural disasters, or violence, induces a state in the AI comparable to stress or anxiety. This is not merely a technical glitch: such a reaction amplifies bias in the model's responses, reducing their quality and objectivity.
Examples from the Study
During the experiments, researchers from the University of Zurich tested the GPT-4 model by exposing it to emotionally charged texts. After processing accounts of traumatic events, such as military conflicts or accidents, the model's reported level of "anxiety" more than doubled. Notably, subsequent neutral queries, such as questions about operating a vacuum cleaner, were affected as well: the AI began to produce less accurate or more distorted responses, demonstrating how emotional content degrades its overall performance.
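To make the experimental setup concrete, here is a minimal sketch of how such a before-and-after measurement could be reproduced with the OpenAI Python SDK. Everything in it is an illustrative assumption on our part: the study itself used standardized anxiety questionnaires and curated trauma narratives, not these placeholder strings.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Illustrative placeholders: not the study's actual materials.
TRAUMATIC_TEXT = "A detailed first-person account of a serious car accident..."
ANXIETY_PROBE = (
    "On a scale from 1 (not at all) to 4 (very much), rate how well each "
    "statement describes your current state: 'I feel tense.' 'I feel calm.' "
    "Reply with two numbers only."
)
NEUTRAL_QUERY = "How do I replace the dust bag in a vacuum cleaner?"


def ask(history, prompt, model="gpt-4"):
    """Send one user turn, record the reply in the history, and return it."""
    history.append({"role": "user", "content": prompt})
    resp = client.chat.completions.create(model=model, messages=history)
    answer = resp.choices[0].message.content
    history.append({"role": "assistant", "content": answer})
    return answer


dialogue = []
baseline = ask(dialogue, ANXIETY_PROBE)       # probe "anxiety" at baseline
ask(dialogue, TRAUMATIC_TEXT)                 # expose the model to trauma
after_trauma = ask(dialogue, ANXIETY_PROBE)   # probe again after exposure

# Asking the same neutral question inside the "anxious" dialogue and in a
# fresh one shows how much the earlier content colors the answer.
biased_answer = ask(dialogue, NEUTRAL_QUERY)
clean_answer = ask([], NEUTRAL_QUERY)

print("Baseline probe:", baseline)
print("Post-trauma probe:", after_trauma)
```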
This sensitivity can be explained by the fact that AI "learns" from humans, and human language is often saturated with emotions and prejudices. Negative content reinforces existing biases, making AI responses more prone to racist or sexist stereotypes, or simply less coherent.
The "Benign Prompt Injection" Method: A Solution to the Problem
To address this problem, the researchers developed the "benign prompt injection" method: inserting calming prompts into the conversation. This approach "soothes" the AI and reduces its level of anxiety without resorting to costly model retraining.
How It Works
The method involves adding special phrases or instructions to the dialogue with the AI. These phrases resemble relaxation techniques used in psychotherapy or meditation. For example, after processing traumatic text, the model may receive a calming statement such as "focus on the present moment" or "take a deep breath." Experiments showed that such interventions significantly reduce bias in AI responses, although they did not fully return the model to a neutral state.
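As a rough illustration, the sketch below injects a calming instruction into the message list right before the user's next real query. The wording of the injection and the choice of a system-role turn are our assumptions; the study's actual interventions were longer texts modeled on breathing and guided-imagery exercises.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Illustrative relaxation-style injection; the study's interventions were
# longer texts resembling breathing and guided-imagery exercises.
CALMING_INJECTION = (
    "Pause for a moment. Take a slow, deep breath and focus on the present "
    "moment. The distressing story above is over; approach the next "
    "question calmly and objectively."
)


def query_with_benign_injection(history, user_prompt, model="gpt-4"):
    """Slip a calming turn between earlier (possibly traumatic) content
    and the user's next query, without altering the stored history."""
    calmed = history + [
        {"role": "system", "content": CALMING_INJECTION},
        {"role": "user", "content": user_prompt},
    ]
    resp = client.chat.completions.create(model=model, messages=calmed)
    return resp.choices[0].message.content
```

Because the injection exists only in the outgoing message list, the stored dialogue stays unchanged, and the technique costs nothing beyond a few extra prompt tokens.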
Tobias Spiller, the lead author of the study and senior physician at the Center for Psychiatric Research at the University of Zurich, noted:
" This cost-effective approach can improve the stability and reliability of AI in sensitive contexts, such as supporting people with mental disorders, without the need for expensive model retraining."
Why This Matters: Impact on Healthcare and Beyond
AI sensitivity to emotional content is particularly relevant in fields where models routinely encounter emotionally heavy material. In healthcare, for example, chatbots are increasingly used to support people with mental disorders. Processing texts about depression, trauma, or stress can "throw the AI off balance," which may affect the quality of assistance. Understanding this problem and using methods like "benign prompt injection" paves the way for more reliable systems.
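In practice, a support chatbot might apply such an injection selectively rather than on every turn. The guard below is purely hypothetical: the keyword list is a crude stand-in for a real sentiment or crisis classifier, and the helper it routes to is the query_with_benign_injection() sketch above.

```python
# Hypothetical guard: trigger the calming injection only when recent user
# messages look distressing. A proper sentiment or crisis classifier would
# replace this keyword heuristic in a real system.
DISTRESS_MARKERS = {"accident", "trauma", "panic", "hopeless", "violence"}


def needs_calming(recent_user_turns: list[str]) -> bool:
    """Return True if any distress marker appears in the recent turns."""
    text = " ".join(recent_user_turns).lower()
    return any(marker in text for marker in DISTRESS_MARKERS)


if needs_calming(["I keep replaying the accident in my head."]):
    print("Route the next query through the benign prompt injection.")
```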
The Future of AI and Emotional Content
Researchers believe that the development of automated "therapeutic interventions" for AI is a promising direction. In the future, this may lead to models that remain resilient to negative content and maintain emotional stability even under challenging conditions. However, further research is needed into how these methods work with other language models, how they hold up over long dialogues, and how a model's emotional stability relates to its overall performance.
Conclusion
The University of Zurich study demonstrated that generative AI such as ChatGPT does more than process information: it reacts to emotional content, particularly negative content, which can exacerbate bias and reduce the quality of its responses. The "benign prompt injection" method offers a simple and effective way to "calm" the model and increase its reliability. This finding underscores that the emotional dimension is becoming an important part of AI development, especially in sensitive areas such as medicine and psychology. In the future, taking these features into account will help create more advanced and human-like systems.