Let me start with a specific scenario.

A mid-sized company's legal department receives a contract from a foreign partner for review. It's in English, 14 pages, full of legal jargon. The deadline is tomorrow. The in-house translator isn't available, and an agency would take three days.

Someone opens ChatGPT, copies the entire text and types: "Translate to Czech, keep the formal style."

Two minutes later it's done. Everyone's happy.

Except that contract contained party names, financial terms, penalty clauses, and confidential business arrangements. And all of it just left the company's infrastructure.

What exactly happens when you send text to an AI tool or translator

When you type text into ChatGPT, DeepL, or Google Translate, that text travels across the internet to the provider's servers. This is necessary – the translation happens there, not on your computer.

The question isn't whether data leaves. It always does. The question is what happens to it afterward.

Google Translate

If you go to translate.google.com in a browser, your text falls under Google's general terms. And those clearly state that Google can use content you submit to improve products, including training AI models. It's fast and free, and your text directly helps train the next generation of Google's models.

DeepL

By default, DeepL processes data globally across servers in the EU, USA, and Japan. If you want to guarantee data stays in the EU, you need to pay extra for a Data Residency add-on, available only through direct sales.

The fine print: even for customers with the paid Data Residency add-on, DeepL reserves the right to process operational components like logs and monitoring outside the selected region. They claim these don't contain translation content, but you can't verify that.

ChatGPT (OpenAI)

In the free version, OpenAI explicitly states conversations may be used for model training. In the paid version (Team), training is off by default. But the account administrator can manually turn it back on, and as a regular user, you have no way of knowing. No notification, no banner, no warning email.

You're translating a sensitive contract and don't know if OpenAI is currently using it to train the next model. If your colleagues use ChatGPT through a company card on the Team plan – check the settings. Now.

Then there's ChatGPT Enterprise, which doesn't use data for training at all. But most companies don't have Enterprise.

And we haven't even discussed where OpenAI's servers are located. ChatGPT runs exclusively on US servers with third parties involved in data processing.

GDPR and translators: a combination nobody's ready for

This is where "theoretical risk" turns into a concrete legal problem.

GDPR states that if you process personal data (names, emails, national ID numbers, addresses), you must know where it's processed, who has access, and on what legal basis.

When you send an HR document to Google Translate, Google effectively becomes a processor of personal data under GDPR. And as a company, you should have a Data Processing Agreement (DPA) in place with them.

Do you? Most companies don't. Most companies don't even know they should.

The fine for GDPR violations can reach EUR 20 million or 4% of worldwide annual turnover, whichever is higher. Not a number you want to ignore.
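As a rough worked example of that ceiling, the sketch below computes the upper bound for the most serious category of infringement under GDPR Article 83(5). The function name and the sample turnover figures are illustrative assumptions, and the result is a legal maximum, not a prediction of what a regulator would actually impose.

```python
def max_gdpr_fine_eur(annual_turnover_eur: float) -> float:
    """Upper bound of a fine under GDPR Art. 83(5).

    The cap is the higher of EUR 20 million and 4% of total worldwide
    annual turnover of the preceding financial year.
    """
    fixed_cap = 20_000_000.0
    turnover_cap = annual_turnover_eur * 4 / 100
    return max(fixed_cap, turnover_cap)


# Hypothetical companies, for illustration only.
print(max_gdpr_fine_eur(50_000_000))     # 4% is 2 million, so the 20 million cap applies
print(max_gdpr_fine_eur(2_000_000_000))  # 4% is 80 million, which exceeds 20 million
```

The point for smaller companies: the 4% figure is not the limit that matters. Below EUR 500 million in turnover, the fixed EUR 20 million cap is the higher of the two.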

Three situations that keep repeating in practice

Situation 1: HR documents

Employment contracts, payslips, employee evaluations. These contain legally protected personal data. Translating them through a public tool without a DPA is a direct GDPR violation.

Situation 2: Contracts with NDA clauses

You signed a non-disclosure agreement with a client. Then you translate their internal documents through ChatGPT. You've shared content with a third party, albeit unintentionally – this could be a direct NDA violation.

Situation 3: Pricing proposals and internal reports

Not legally protected, but strategically sensitive. Competitive intelligence exists. And once data leaks, you can't take it back.

What companies get wrong (and how to fix it)

The biggest problem isn't that people are lazy. The problem is that companies lack a policy. Everyone handles translation ad hoc, with whatever tool they have at hand.

The solution isn't banning AI translation – nobody would follow that. The solution is having a clear framework:

Step 1: Categorize documents

| Category | Examples | Rule |
|---|---|---|
| Public | Marketing materials, web content | No restrictions |
| Internal | Reports, pricing proposals, strategy | Approved tools only |
| Sensitive | NDA contracts, HR documents, financial data | Approved infrastructure only |
| Secret | Trade secrets, patents, acquisitions | Must not leave company infrastructure |

Write it on one page and put it on the intranet. It doesn't have to be perfect – it has to exist.
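If you want the same policy in a form that an internal tool or browser plug-in could check, the four categories above can be sketched as a small lookup. This is a minimal illustration, not a finished policy engine: the tier names (`public_web_tool`, `approved_tool`, `approved_infrastructure`, `on_premises`) and their ordering are assumptions made for the example, and you would replace them with whatever your company has actually vetted.

```python
from enum import Enum


class Category(Enum):
    PUBLIC = "public"
    INTERNAL = "internal"
    SENSITIVE = "sensitive"
    SECRET = "secret"


# Rules as worded in the categorization table.
RULES = {
    Category.PUBLIC: "No restrictions",
    Category.INTERNAL: "Approved tools only",
    Category.SENSITIVE: "Approved infrastructure only",
    Category.SECRET: "Must not leave company infrastructure",
}

# Hypothetical tool tiers, ordered from least to most controlled.
TOOL_TIERS = [
    "public_web_tool",
    "approved_tool",
    "approved_infrastructure",
    "on_premises",
]

# Assumed mapping: the least controlled tier each category may use.
MINIMUM_TIER = {
    Category.PUBLIC: "public_web_tool",
    Category.INTERNAL: "approved_tool",
    Category.SENSITIVE: "approved_infrastructure",
    Category.SECRET: "on_premises",
}


def may_translate(category: Category, tool_tier: str) -> bool:
    """Return True if a tool of this tier is allowed for the category."""
    required = TOOL_TIERS.index(MINIMUM_TIER[category])
    return TOOL_TIERS.index(tool_tier) >= required


print(may_translate(Category.PUBLIC, "public_web_tool"))   # allowed
print(may_translate(Category.SENSITIVE, "approved_tool"))  # not allowed
```

The hard part is not the lookup but assigning the category in the first place. That still needs a person, or at least a default of "Sensitive" when nobody has decided.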

Step 2: API isn't automatically secure

Many IT departments assume API access = secure access. It's not that simple. An API may have a DPA and better contractual terms, but data still leaves your infrastructure. And with poor implementation, data can leak without anyone noticing immediately.

Step 3: Audit the tools you're currently using

For every translation tool the company uses, find out: Where are the servers? Is there a DPA? What's the data retention policy?

Practical shortcut: DeepL doesn't offer a DPA by default to free users or to most paid users; you have to actively request one through the enterprise tier. If you don't know whether you have one, find out, or stop using the tool for sensitive data.
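The three audit questions fit into a very small record per tool, which makes it obvious where the answers are still missing. The sketch below is one possible shape, assuming Python dataclasses. The class, its field names, the tool name "ExampleTranslate", and the rule that a tool qualifies for sensitive data only when every question is answered and a DPA exists are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToolAudit:
    name: str
    server_regions: Optional[List[str]] = None  # None means "not yet known"
    has_dpa: Optional[bool] = None
    retention_policy: Optional[str] = None

    def open_questions(self) -> List[str]:
        """List the audit questions that nobody has answered yet."""
        questions = []
        if not self.server_regions:
            questions.append("Where are the servers?")
        if self.has_dpa is None:
            questions.append("Is there a DPA?")
        if not self.retention_policy:
            questions.append("What's the data retention policy?")
        return questions

    def ok_for_sensitive_data(self) -> bool:
        # Assumed rule: everything answered AND a DPA is actually in place.
        return not self.open_questions() and self.has_dpa is True


# Hypothetical tool with nothing filled in yet.
audit = ToolAudit(name="ExampleTranslate")
for question in audit.open_questions():
    print(f"{audit.name}: {question}")
print(audit.ok_for_sensitive_data())
```

A spreadsheet with the same three columns does the job just as well. What matters is that an unanswered question counts as a "no" for sensitive documents.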

Step 4: Look for tools with clear data policies

The ideal solution for companies that translate regularly and work with sensitive documents is a tool running on dedicated infrastructure: either on-premises or with a provider that contractually guarantees data stays in the EU and goes nowhere else. You know exactly what happens to your data, you have an audit trail, and you can demonstrate compliance during inspections.

One exercise for today

Open your chat history in ChatGPT or your translation history in DeepL. Scroll through the last ten items.

Would it be a problem if any of those texts got out?

If so, you now know what to address first.

If you're looking for a translation tool that runs on dedicated infrastructure in the EU, preserves document formatting, and lets you audit who translated what – take a look at Syntax Translate.