Favorite Places of Your Favorite City


How AI Reacts to Negative Content and the Solution to Reduce Bias

How AI Reacts to Negative Content and the Solution to Reduce Bias

March 6, 2025,5 min. to read

Artificial Intelligence (AI) has long ceased to be merely a tool for data processing—it is beginning to exhibit traits remarkably similar to those of humans. Research from scientists at the University of Zurich has shown that generative AI models, such as ChatGPT, are sensitive to emotional content, particularly negative content. This discovery raises important questions about how emotions affect AI performance and what can be done to enhance its stability. In this article, we will break down the key findings of the study, describe how AI responds to traumatic content, and explain how the "benign prompt injection" method helps address this issue.

How AI Responds to Negative Content

Generative AI models, such as ChatGPT, are trained on vast amounts of human-created texts. Along with information, they also inherit cognitive biases, including emotional responses. Scientists found that negative content—such as descriptions of car accidents, natural disasters, or violence—induces a state in AI that can be compared to stress or anxiety. This is not merely a technical glitch: such a reaction amplifies bias in the model's responses, reducing their quality and objectivity.

Examples from the Study

During the experiments, researchers from the University of Zurich tested the GPT-4 model by subjecting it to emotionally charged texts. After processing stories of traumatic events, such as military conflicts or accidents, the AI's level of "anxiety" more than doubled. Interestingly, subsequent neutral queries—such as questions about operating a vacuum cleaner—also fell victim to this bias. The AI began to produce less accurate or more distorted responses, demonstrating how emotional content affects its overall performance. This sensitivity can be explained by the fact that AI "learns" from humans, and human language is often imbued with emotions and prejudices. Negative content reinforces existing biases, making AI responses more racist, sexist, or simply less logical.

The "Benign Prompt Injection" Method: A Solution to the Problem

To address this problem, scientists have developed the "benign prompt injection" method (injection of favorable prompts). This approach allows the AI to be "calmed" and reduces its level of anxiety without resorting to costly model retraining.

How It Works

The method involves adding special phrases or instructions to the dialogue with the AI. These phrases resemble relaxation techniques used in psychotherapy or meditation. For example, after processing traumatic text, the model may receive a calming statement such as "focus on the present moment" or "take a deep breath." Experiments have shown that such interventions significantly reduce bias in AI responses, although fully returning the model to a neutral state has not yet been achieved. Tobias Spiller, the lead author of the study and senior physician at the Center for Psychiatric Research at the University of Zurich, noted: " This cost-effective approach can improve the stability and reliability of AI in sensitive contexts, such as supporting people with mental disorders, without the need for expensive model retraining."

Why This Matters: Impact on Healthcare and Beyond

AI sensitivity to emotional content is particularly relevant in areas where it encounters heavy subjects. For example, in healthcare, chatbots are increasingly used to support people with mental disorders. Processing texts about depression, trauma, or stress can "throw the AI off balance," which may affect the quality of assistance. Understanding this problem and using methods like "benign prompt injection" paves the way for the creation of more reliable systems.

The Future of AI and Emotional Content

Researchers believe that the development of automated "therapeutic interventions" for AI is a promising direction. In the future, this may lead to the emergence of models that are resilient to negative content and can maintain emotional stability even under challenging conditions. However, further research is needed: how these methods work with other language models, how they affect long dialogues, and how AI's emotional stability relates to its overall performance.

Conclusion

The University of Zurich study demonstrated that generative AI, such as ChatGPT, does more than process information—it reacts to emotions, particularly negative ones, which can exacerbate bias and reduce performance quality. The "benign prompt injection" method offers a simple and effective solution that "calms" the model and increases its reliability. This discovery underscores that the emotional aspect is becoming an important part of AI development, especially in sensitive areas such as medicine and psychology. In the future, taking these features into account will help create more advanced and human-like systems.

Latest Articles

Signs of a Gold Digger: How to Recognize Self-Serving Relationships
Signs of a Gold Digger: How to Recognize Self-Serving Relationships

Signs of a gold digger include obsession with money, expensive gifts, jealousy, entitlement, and using relationships for wealth and status, not love.

Read more

Understanding Trolling: Types, Techniques, and How to Protect Yourself Online
Understanding Trolling: Types, Techniques, and How to Protect Yourself Online

Trolling is a deliberate provocation in online communities aimed at causing conflict. It includes various techniques like subtle, heavy, and pseudo-trolling, and can be harmful.

Read more

What Does “Chinazes” Mean: Origin and Use of a Youth Slang Meme
What Does “Chinazes” Mean: Origin and Use of a Youth Slang Meme

An overview of the meme word “chinazes”: its origin, meaning in youth slang, pronunciation, usage examples, cultural impact, and why parents shouldn’t worry.

Read more

What Does KEK Mean? Origin and Meaning of the Popular Internet Meme
What Does KEK Mean? Origin and Meaning of the Popular Internet Meme

An overview of the KEK meme: its gaming origins, evolution from StarCraft and World of Warcraft, meanings in slang, cultural ties, and difference from LOL.

Read more

What Is Bullying: Signs, Types, Causes, Consequences, and How to Stop It
What Is Bullying: Signs, Types, Causes, Consequences, and How to Stop It

A comprehensive overview of bullying: its signs, types, causes, consequences, and effective ways to recognize, prevent, and respond to harassment.

Read more

Ghosting: What It Is, Why It Hurts, and How to Cope
Ghosting: What It Is, Why It Hurts, and How to Cope

Ghosting is the sudden, explanation-free end of communication that causes emotional harm, affects self-esteem, and reflects avoidance, immaturity, or fear in relationships.

Read more

Why Memes Replaced Anecdotes in the Digital Age
Why Memes Replaced Anecdotes in the Digital Age

Why memes replaced anecdotes: digital speed, visual form, collective creativity, and adaptability make memes ideal for modern online communication.

Read more

What Does “LOL” Mean? Origin, Usage, and Role in Modern Communication
What Does “LOL” Mean? Origin, Usage, and Role in Modern Communication

A concise guide to the meaning, origin, usage, history, and synonyms of “lol,” explaining its role in modern internet and everyday communication.

Read more

OMG: Meaning, History, and Use in Internet Slang
OMG: Meaning, History, and Use in Internet Slang

Explanation of the meaning, history, usage, and cultural significance of the abbreviation OMG in internet slang, including examples and its recognition in dictionaries.

Read more

Slavic Gods: Pantheon, Myths, and Cultural Legacy
Slavic Gods: Pantheon, Myths, and Cultural Legacy

Overview of Slavic paganism and its main gods, their roles, symbols, myths, and lasting influence on culture, language, and traditions.

Read more

ру | en | 中文

Contact author