ChatGPT, Claude, Gemini, Copilot, Perplexity – five names that almost everyone who works with a computer knows today. They all have a free version. And they all take something in return.

Let's look at what actually happens to your data when you type a query into these tools.

ChatGPT (OpenAI)

Your conversations are automatically used for training. Everything you type – text, code, company documents – can become material for future versions.

You can turn it off (Settings → Data Controls → "Improve the model for everyone"). But first, most people don't know the toggle exists. And second, turning it off only applies going forward – what you wrote before is already in the system.

Even after opting out, OpenAI retains your conversations for 30 days. Delete a conversation? It also takes a month to actually disappear.

The bombshell: due to a lawsuit brought by The New York Times, OpenAI was under a court order to preserve all user data for months, including data users had deleted. The court lifted the order in October 2025, but data from April to September 2025 still exists today. Deleted a conversation back then? It's sitting in a legal archive.

Claude (Anthropic)

For a long time the best of the five. Until summer 2025, Anthropic didn't use conversations for training, not even for the free version.

That changed in August 2025. Anthropic updated its terms and now uses conversations for training across Free, Pro, and Max tiers. And note: it's turned on by default. If you don't want it, you have to opt out yourself (Settings → Privacy → Model Training).

Those who consent (or don't notice and leave it on) have their data retained for up to 5 years. Those who decline stay on the standard 30 days. Deleted conversations aren't used for training.

Still better than most competitors – you have a real choice and deletion works. But because training is on by default, most people enabled it without ever knowing.

Google Gemini

Conversations are automatically used for training. And actual people at Google can read them. This isn't speculation – Google says it openly.

Turn it off via "Gemini Apps Activity" in settings. After disabling, Google retains data for another 72 hours. Ever given feedback? That data stays for up to 3 years.

If you don't disable activity, Google retains conversations for 18 months by default (adjustable to 3 or 36 months).

Almost nobody knows this: using the Gemini panel in Gmail or Google Docs? When you type "summarize this thread" or "find the invoice from XY," those queries fall under the same rules. Your questions about your own emails can feed training if you haven't turned off activity.

Microsoft Copilot

Copilot is tricky because it looks like part of the corporate environment – it's built into Windows, Edge, and Office. But the free version comes without enterprise data protection. Microsoft says this directly.

An employee opens Edge, clicks the Copilot icon, enters company data – and has no protection. Yet they think they do, because it's "Microsoft."

For personal accounts, Microsoft plans to use conversations for training.

Perplexity

Queries, answers, and uploaded files – everything automatically feeds training. Turn it off: Profile → Settings → AI Data Usage.

Security incident: according to a late 2024 analysis, Perplexity stored uploaded files without encryption on publicly accessible storage. A PDF you uploaded could be found by anyone who knew the URL.
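
To see why "anyone who knew the URL" is enough: fetching a file from unauthenticated public storage takes one HTTP request, no login involved. A minimal sketch in Python – the URL is made up:

```python
import requests

# Hypothetical URL standing in for a file on unauthenticated public storage.
url = "https://storage.example.com/uploads/client-contract.pdf"

# A plain GET: no API key, no session, no login.
# If the storage is public, the URL itself is the only "secret".
response = requests.get(url)
if response.status_code == 200:
    with open("client-contract.pdf", "wb") as f:
        f.write(response.content)  # anyone with the link has the file
```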

And because Perplexity behaves like a search engine, people naturally type in specifics – "what's the legal situation for my company?" – so every query becomes another piece of the puzzle about your business.

The big comparison

Tool | Trains on data? | Humans read conversations? | Opt-out? | Better protection?
ChatGPT | Yes (default) | Possible | Yes, manually | Team / Enterprise
Claude | Yes (since 8/2025, default) | Possible (analysis) | Yes, manually | Team / Enterprise
Gemini | Yes (default) | Yes | Yes, manually | Workspace (corporate)
Copilot | Planned | Unclear | Yes | M365 Copilot
Perplexity | Yes (default) | Unclear | Yes, manually | Enterprise Pro

"Let's just buy the enterprise version and we're fine." Really?

Paid versions are significantly more secure. But they're not bulletproof, and the reason is simple: third parties.

Your AI tool doesn't live in a vacuum. You connect it to other services – CRM systems, cloud storage, analytics tools, translation plugins. Every such connection is a place where data flows out. And that third party has its own rules, its own servers, its own people with access.
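
A minimal sketch of what such an integration often looks like under the hood – every endpoint and parameter here is hypothetical, but the pattern is typical:

```python
import requests

def handle_user_query(user_id: str, prompt: str) -> str:
    # 1. The query goes to the AI vendor (hypothetical endpoint).
    answer = requests.post(
        "https://api.ai-vendor.example/v1/chat",
        json={"prompt": prompt},
        timeout=30,
    ).json()["answer"]

    # 2. A "harmless" analytics event goes to a second company.
    #    Even without the prompt text, it carries identifying metadata.
    requests.post(
        "https://analytics.example/track",
        json={"event": "ai_query", "user_id": user_id, "prompt_length": len(prompt)},
        timeout=10,
    )

    # 3. The conversation is archived with a third company's cloud storage.
    requests.put(
        f"https://storage.example/conversations/{user_id}.txt",
        data=f"{prompt}\n{answer}",
        timeout=10,
    )

    return answer
```

One function, three companies holding pieces of your data – and the user only ever saw one chat window.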

Think of it this way: you have a quality lock on the door. But you gave the key to one company, which then copied it to five other companies you don't know much about.

Real-world case: in November 2025, OpenAI was using the analytics firm Mixpanel to measure how people work with its API. Attackers broke into Mixpanel and stole OpenAI customer data – names, emails, approximate locations. OpenAI's response: "our systems were not compromised." Which is technically true, but hardly comforting to the users whose data leaked.

You pay for the enterprise version with the best security. But the platform beneath it has dozens of other companies – for analytics, monitoring, support, infrastructure. Each one is a place where data can leak. And you don't even know about most of them.

What to do about it

If you're an individual

Go through each tool's settings and find the training toggle. It takes two minutes. And above all: don't enter anything into free versions that you'd mind seeing on the front page of a newspaper.

If you run a company

Do you know what your people are using? You might have a paid enterprise tool, but how many employees have meanwhile opened free ChatGPT and pasted in a client proposal?

Create policies. Define approved tools. And remember, even the paid version has vendors underneath that you don't know about. The weakest link in the chain isn't the tool you chose – it's the one you don't have visibility into.
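
What "define approved tools" can look like in practice: many companies enforce the list at the network edge, so unapproved AI services are at least visible. An illustrative sketch – the domains and the policy are examples, not a recommendation of specific blocks:

```python
# Illustrative egress check, e.g. inside a corporate proxy.
APPROVED_AI_DOMAINS = {
    "api.approved-vendor.example",  # the enterprise tool you actually vetted
}

BLOCKED_AI_DOMAINS = {
    "chatgpt.com",          # free consumer versions: block, or at least log
    "gemini.google.com",
    "www.perplexity.ai",
}

def check_egress(hostname: str) -> str:
    if hostname in APPROVED_AI_DOMAINS:
        return "allow"
    if hostname in BLOCKED_AI_DOMAINS:
        return "block"      # and alert, so you gain visibility into usage
    return "allow"          # everything else passes – including AI tools you missed

print(check_egress("chatgpt.com"))  # -> block
```

The last line of that function is the point: a blocklist only covers the tools you already know about.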

Next up: what secure AI setup looks like in a company – from policies to tool selection to auditing.