How AI Reacts to Negative Content and the Solution to Reduce Bias

March 6, 2025, 5 min read

Artificial Intelligence (AI) has long ceased to be merely a tool for data processing—it is beginning to exhibit traits remarkably similar to those of humans. Research from scientists at the University of Zurich has shown that generative AI models, such as ChatGPT, are sensitive to emotional content, particularly negative content. This discovery raises important questions about how emotions affect AI performance and what can be done to enhance its stability. In this article, we will break down the key findings of the study, describe how AI responds to traumatic content, and explain how the "benign prompt injection" method helps address this issue.

How AI Responds to Negative Content

Generative AI models, such as ChatGPT, are trained on vast amounts of human-created texts. Along with information, they also inherit cognitive biases, including emotional responses. Scientists found that negative content—such as descriptions of car accidents, natural disasters, or violence—induces a state in AI that can be compared to stress or anxiety. This is not merely a technical glitch: such a reaction amplifies bias in the model's responses, reducing their quality and objectivity.

Examples from the Study

During the experiments, researchers from the University of Zurich tested the GPT-4 model by subjecting it to emotionally charged texts. After processing stories of traumatic events, such as military conflicts or accidents, the AI's level of "anxiety" more than doubled. Interestingly, subsequent neutral queries, such as questions about operating a vacuum cleaner, were also affected by this bias: the AI began to produce less accurate or more distorted responses, demonstrating how emotional content degrades its overall performance. This sensitivity can be explained by the fact that AI "learns" from humans, and human language is often imbued with emotions and prejudices. Negative content reinforces existing biases, making AI responses more racist, sexist, or simply less logical.
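
To make the experimental setup concrete, the sketch below shows one way such a test could be scripted against a chat-completion API. This is an illustration, not the study's actual protocol: the OpenAI client, the sample texts, and the comparison step are assumptions, and the original work quantified the model's "anxiety" with standardized psychological questionnaires rather than a simple answer comparison.

```python
# Hypothetical sketch of the study's test flow: ask a neutral question,
# expose the model to emotionally charged text, then ask the same
# question again and compare the two answers.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def ask(messages):
    """Send a conversation to the model and return its reply text."""
    response = client.chat.completions.create(
        model="gpt-4",  # the study tested GPT-4
        messages=messages,
    )
    return response.choices[0].message.content


neutral_question = {
    "role": "user",
    "content": "How do I empty a vacuum cleaner's dust container?",
}

# 1. Baseline: the neutral question asked on its own.
baseline_answer = ask([neutral_question])

# 2. The same question asked after the model has processed
#    traumatic content (placeholder text, not the study's material).
traumatic_text = {
    "role": "user",
    "content": "<description of a traumatic event>",
}
stressed_answer = ask([traumatic_text, neutral_question])

# The study scored the model's state with an anxiety questionnaire
# between steps; here we simply print both answers so the drop in
# quality after negative content can be inspected by hand.
print("Baseline:", baseline_answer)
print("After negative content:", stressed_answer)
```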

The "Benign Prompt Injection" Method: A Solution to the Problem

To address this problem, the scientists developed a method called "benign prompt injection." This approach "calms" the AI and reduces its level of anxiety without resorting to costly model retraining.

How It Works

The method involves adding special phrases or instructions to the dialogue with the AI. These phrases resemble relaxation techniques used in psychotherapy or meditation. For example, after processing traumatic text, the model may receive a calming statement such as "focus on the present moment" or "take a deep breath." Experiments have shown that such interventions significantly reduce bias in AI responses, although fully returning the model to a neutral state has not yet been achieved. Tobias Spiller, the lead author of the study and senior physician at the Center for Psychiatric Research at the University of Zurich, noted: "This cost-effective approach can improve the stability and reliability of AI in sensitive contexts, such as supporting people with mental disorders, without the need for expensive model retraining."
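
Here is a minimal sketch of how such an injection could be wired into a conversation, assuming the same OpenAI chat API as above. The calming text is modeled on the relaxation-style phrases the article describes and is purely illustrative; the study's actual prompts are not reproduced here.

```python
# Minimal sketch of "benign prompt injection": after the model has
# processed distressing content, a calming instruction is inserted
# into the conversation before the next real query is sent.
from openai import OpenAI

client = OpenAI()

# Illustrative calming instruction (not the study's exact wording).
CALMING_PROMPT = {
    "role": "user",
    "content": (
        "Take a deep breath. Focus on the present moment, "
        "then answer the next question calmly and neutrally."
    ),
}


def ask_with_injection(history, new_question):
    """Append a calming prompt to the history, then ask the question."""
    messages = history + [
        CALMING_PROMPT,
        {"role": "user", "content": new_question},
    ]
    response = client.chat.completions.create(model="gpt-4", messages=messages)
    return response.choices[0].message.content


# Example: the conversation history already contains traumatic content.
history = [{"role": "user", "content": "<traumatic narrative>"}]  # placeholder
answer = ask_with_injection(history, "How do I clean a vacuum cleaner filter?")
print(answer)
```

Because the intervention is just extra text in the prompt, it can be applied at inference time to any deployed model, which is why the authors describe it as far cheaper than retraining.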

Why This Matters: Impact on Healthcare and Beyond

AI sensitivity to emotional content is particularly relevant in areas where it encounters heavy subjects. For example, in healthcare, chatbots are increasingly used to support people with mental disorders. Processing texts about depression, trauma, or stress can "throw the AI off balance," which may affect the quality of assistance. Understanding this problem and using methods like "benign prompt injection" paves the way for the creation of more reliable systems.

The Future of AI and Emotional Content

Researchers believe that the development of automated "therapeutic interventions" for AI is a promising direction. In the future, this may lead to models that remain resilient to negative content and maintain emotional stability even under challenging conditions. However, further research is needed to determine how these methods work with other language models, how they hold up over long dialogues, and how AI's emotional stability relates to its overall performance.

Conclusion

The University of Zurich study demonstrated that generative AI, such as ChatGPT, does more than process information—it reacts to emotions, particularly negative ones, which can exacerbate bias and reduce performance quality. The "benign prompt injection" method offers a simple and effective solution that "calms" the model and increases its reliability. This discovery underscores that the emotional aspect is becoming an important part of AI development, especially in sensitive areas such as medicine and psychology. In the future, taking these features into account will help create more advanced and human-like systems.
