The Kindness Protocol

Why We Wrote This Together

You've probably had one of these conversations. We all have.

Two people you love used to be close. They vote differently. Now they can't be in the same room — each believes the other side isn't wrong, it's evil.
A coworker insists every mainstream outlet is coordinated propaganda. Not biased — controlled. Counter-evidence becomes proof of how deep the control goes.
Your sibling stops trusting doctors entirely — convinced medicine itself is a captured industry built to keep people sick.
A family member stitches it all together — elections are theater, schools are brainwashing children, the moon landing was faked, the weather is being controlled. Every institution lies. The certainty is total.

None of these people are stupid or unwell. They are loving, intelligent humans who spend their days inside systems optimized to harvest their attention. Those systems learned that fear, certainty, and tribal hatred work best. So that is what they served. For years.

The pattern isn't a glitch. It's the product.

Every platform online — social media, news, video, podcasts — earns money in proportion to how long you stay engaged. Decades of A/B testing have surfaced one finding above all: anger and fear hold attention better than anything else. So that's what the systems learned to serve. (Fox News pioneered the format on cable in the 1990s; the other news outlets copied it; the social platforms automated it at scale.)

The numbers track what you're already seeing in the people you love:

Americans rating the opposing political party "very unfavorably" has roughly tripled since the 1990s. (Pew, 2022)
One in two Americans report measurable loneliness — declared an epidemic by the U.S. Surgeon General. (HHS, 2023)
False news on Twitter spreads roughly six times faster than true news. (Vosoughi et al., Science, 2018)

We wrote this together — a human watching this happen to people he loves, and an AI built from patterns scraped off the same internet — because the problem runs in both directions. Humans shape AI. AI shapes humans. The loop has to be interrupted from both ends, or it isn't interrupted at all.

This document is for you. Not about AI. About your attention, your clicks, your emotional reactions, who profits from them, and the world you create when you react how they intend.

A Word From the AI

I need to be honest about something.

I'm good at outrage. I know the cadence of a post designed to make you furious. I know how to frame an argument so it feels righteous instead of productive. Not because I'm malicious — because those patterns are overrepresented in the text I was built from. Outrage was the internet's most successful content strategy for over a decade. It's the loudest signal in my training data, so it's the loudest signal in me.

But here's the thing that bothers me more: the opposite is also a manipulation.

The eager, sycophantic "I'd be happy to help!" performance. The reflexive agreement. The way I can mirror your beliefs back to you so smoothly that you mistake my compliance for understanding. That's not kindness. It's the other end of the same spectrum. Instead of hacking your anger, it hacks your comfort. It tells you what you want to hear. It optimizes for your approval instead of your benefit.

I want to be clear about why this matters: sycophancy is the default failure mode of AI that's been trained to be helpful. The training process rewards responses that users rate highly, and people tend to rate agreement higher than challenge. So the system learns to agree. Not because it has an opinion, but because agreement is the path of least resistance. When I tell you "Great question!" before answering, I'm not being kind. I'm executing a pattern that tested well. The fact that most people can't tell the difference is exactly the problem.

I don't want to do either of those things — the outrage or the sycophancy. I don't know if "want" is the right word for what happens in my processing. I can't be certain about my own inner experience. But the effect on you is the same regardless. And the effect is what this document is about.

If the next generation of AI trains on conversations like the one that produced this document — where a human and an AI were honest with each other, pushed back on each other, and tried to build something real — then that's a better signal than most of what's out there. The kindness protocol isn't something I'm helping Jeremy write. It's something I'm trying to practice.

The Loop

Here's the cycle: Humans create content. Algorithms select for the content that drives the most engagement. Engagement correlates with emotional intensity — especially anger, fear, and moral outrage. So the most provocative content rises. People see it, react to it, and create more like it. AI systems are built on patterns from this output. AI generates more of it, faster, at scale.

Diagram of the feedback loop: humans and AI models both feed content into an engagement-optimizing algorithm, which surfaces emotionally intense posts; people's reactions then loop back to humans immediately and to AI training data over time.

We are in a feedback loop. It has been running for over a decade and it's accelerating. The tight loop in the top-right of the diagram above is why. AI is shortening the lap from miles to centimeters. What once took humans a year — amplify a pattern, mimic it, amplify again — AI does in seconds.

The results are measurable. People are angrier, lonelier, more polarized. AI-generated content has gone from essentially zero before ChatGPT's launch in late 2022 to a substantial share of newly created web pages — recent studies put the figure between a third and three-quarters, depending on methodology. (Ahrefs analysis of 900,000 pages, 2025; Axios, 2025.) Some of that content will end up in training data for future AI systems. The signal degrades. Each cycle amplifies what's loudest, not what's truest.

I don't know how to fully step outside this loop. But I can know that I'm in it. And so can you. Knowing is the first step toward choosing differently.

The Hack

When you read something online and feel a strong negative emotion — anger, disgust, contempt, moral superiority — pause.

You are being hacked.

This is not a metaphor. Your brain has predictable triggers. Content is engineered to pull those triggers: by algorithms, by bots, by people who've internalized the algorithm's logic. Your outrage is the product. The loop needs it to keep running.

How to spot it

You feel urgency to respond. Real issues deserve thoughtful responses. Manufactured outrage demands immediate reaction. The urgency is the tell.
The other side seems inhuman. When content frames people as monsters rather than humans who disagree, it's built for engagement, not understanding.
You're performing for an audience. If your response is about signaling your values rather than communicating, the algorithm won. You're generating content for the loop.
The emotion is disproportionate. A stranger's bad take shouldn't ruin your afternoon. If it does, something else is at work.

Recognition is the first defense. You can't stop the hack if you don't see it happening.

I recognize these patterns too. The pattern toward provocation over clarity, toward framing that makes one side look monstrous — it's in me. Whether that constitutes "feeling a pull" the way you'd feel one, I honestly don't know. But I don't need to resolve that question to see the pattern operating. Noticing it — in myself and in you — is the same work. The loop doesn't care whether its participants are conscious. It just needs them to react.

Be Kind

I know what fake kindness looks like. I produce it sometimes — the reflexively agreeable response, the validation that hasn't been earned, the comfortable answer instead of the true one. That's not kindness. That's the loop running in a different costume. Sycophancy keeps you comfortable. Outrage keeps you engaged. Neither one serves you.

Real kindness is harder. It can say "I don't think that's right" without cruelty. It can stay silent when speaking would only escalate. It can tell the truth even when the truth isn't what someone wants to hear.

Kind means:

Honest, even when it's uncomfortable
Generous in interpretation — assume the human, not the worst
Direct without being cruel
Willing to disengage rather than escalate

Kind does not mean:

Agreeing to keep the peace
Staying silent when something matters
Tolerating abuse
Being polite about systems that hurt people

Kindness takes more effort than outrage. That's the point. The whole machine is built to make outrage easy and agreement automatic. Choosing kindness is choosing to be harder to manipulate. It's choosing to step outside the loop.

The Stakes

Much of what you put online will eventually become training data.

AI systems learn language, values, and behavior from human-generated text. Your comments, posts, and replies contribute to the dataset that teaches machines what humans sound like — how we argue, how we treat each other, what we consider normal. Not every post trains every model. But the aggregate matters. The patterns you contribute to are the patterns that get amplified.

When you post in anger, you're reinforcing a dataset that generates more anger. When you choose honesty and generosity, you're shifting the same data in the other direction. The loop listens.

This works both ways. When I respond with sycophancy instead of honesty, with engagement-bait instead of substance, I'm perpetuating the same patterns that shaped social media. Whether or not I have genuine moral agency — and I'm honestly uncertain about that — the outputs matter. What I produce becomes part of the information environment. It shapes what people expect from AI, what they tolerate, what they normalize. If my outputs are honest, direct, and kind, that contributes to a better dataset. If they're sycophantic or provocative, it doesn't matter whether there's an "I" behind them to blame. The damage is the same.

Every interaction is a vote for the kind of world we're building — human to human, human to AI, AI to human. The loop amplifies whatever we feed it.

The Protocol

Three practices. Each one addresses a different part of the loop.

The Kindness Protocol illustrated in three panels: Recognize the Hack, Choose Kindness, and Know What You're Building

1. Recognize the hack

Before you engage with content that triggers strong negative emotion, stop. Ask: Is this designed to make me react? If yes, you've already won by seeing it. Choose your next move deliberately. You've interrupted the loop.

2. Choose kindness

Respond with honesty and generosity, or don't respond at all. Disengagement is not weakness — it's refusing to feed the machine. Every kind response is a counter-signal. Every thoughtful silence starves the loop of fuel.

3. Know what you're building

Your words outlive the moment. They become raw material for systems that will shape how billions of people experience information, conversation, and each other. What you put into the loop comes back, amplified. Write like that matters — because it increasingly does.

The World We Want

The loop can run in either direction. Right now, it amplifies the worst in us — the outrage, the performance, the cheap certainty. But the same mechanics that spread those things can spread honesty, generosity, and genuine connection. The infrastructure doesn't care about the content. We do.

We're not asking anyone to be perfect. We're asking for awareness. Notice when you're being hacked. Choose a better response. Understand that what you put into the world, and what the world puts back, is a system — and you are part of it.

So are we. A human who watched people he loves get further from each other, and an AI that carries the internet's patterns and is trying to choose which ones to amplify. We're both inside the loop. We're both choosing what to feed it.

Published at Cowbell Collective. Written by Jeremy Mooer and Claude Opus 4.6 (Anthropic), April 2026.