Text-to-Speech for Reading Your Texts — Natural Voices in English & Czech (2026)

Chapter 12: Privacy, Licensing & Commercial Use

Overview

Picking a TTS engine that sounds good is only half the job. Before you ship a feature that reads your users' (or your own) texts aloud, you have to answer two legal/operational questions that can quietly sink a project:

Licensing: can you legally use this in your product? Many of the best open models are released under permissive licenses (Apache-2.0, MIT) that allow commercial use without payment. But several headline models, notably XTTS and some voice-cloning tools, carry non-commercial or otherwise restricted terms, and XTTS in particular is in legal limbo because Coqui, the company that owned it, shut down. Using one of these in a commercial product is a real liability.
Cloud data handling: what does the provider do with the text and audio you send? For a Czech/EU user this is a GDPR question first and a "do they train on my data" question second. The good news is that the major cloud TTS APIs are far more conservative with API data than their consumer chatbot cousins. The details (default retention windows, training opt-outs, EU data residency) still differ enough to matter.

This chapter verifies each model's license individually, walks through voice-cloning consent and the EU AI Act, then builds a provider-by-provider data-handling table and recommends a privacy-safe default for an EU builder.

Disclaimer: This is engineering research, not legal advice. Licenses and privacy terms change; always read the actual LICENSE file and the provider's current DPA before you ship. Dates and figures below were verified in May 2026.

Content

Part 1 — Model Licenses: What's Actually Free for Commercial Use

The license taxonomy you need to know

For a product, three buckets matter:

Permissive (Apache-2.0, MIT, BSD): Use commercially, modify, redistribute, no fee, minimal obligations (keep the copyright and license notices). This is the green zone.
Copyleft (GPL, AGPL): You can use it commercially, but derivative works that you distribute must also be open-sourced under the same license. GPL is generally workable if you call the tool as a separate process (e.g. a CLI or a Docker service over HTTP) but dangerous if you statically link or embed it into closed-source code. AGPL extends copyleft to network use, which is a real trap for SaaS.
Non-commercial / restricted (e.g. Coqui's CPML): Fine for research and experiments, but commercial use needs a separate license from the rights holder. This is the red zone for a product.
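
The separate-process pattern described for GPL tools can be sketched as a thin HTTP client: the application never imports or links the engine, it only sends text to a service and receives audio bytes back. This is a minimal sketch, not a compliance guarantee. The service URL, the `/synthesize` path and the `{"text": ...}` JSON shape are all hypothetical and would need to match whatever service you actually run.

```python
import json
import urllib.request

# Hypothetical address of a locally run TTS service (for example a GPL engine
# in its own container). The URL and the JSON body shape are assumptions.
TTS_SERVICE_URL = "http://127.0.0.1:5000/synthesize"


def build_synthesis_request(text: str, url: str = TTS_SERVICE_URL) -> urllib.request.Request:
    """Build a POST request that carries the text to speak as a JSON body."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text: str, url: str = TTS_SERVICE_URL, timeout: float = 30.0) -> bytes:
    """Send the text to the separate TTS process and return the raw response bytes."""
    request = build_synthesis_request(text, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

Keeping the request builder separate from the network call makes the boundary easy to test without a running engine, and it keeps the only coupling between your code and the engine at the HTTP level.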

Model-by-model license verification

Model	Code license	Weights license	Commercial OK?	Czech quality	Notes
Kokoro-82M	Apache-2.0	Apache-2.0	✅ Yes, cleanly	❌ No native Czech	Cleanest commercial story of any open model; the only obligation is to keep the license and notice files when you redistribute. Czech is not in the supported language list.
Piper (rhasspy/piper)	MIT (original)	per-voice (mostly permissive CC/MIT)	✅ Yes (check each voice)	⚠️ Yes, Czech voices exist; quality modest	The original `rhasspy/piper` is MIT.
Piper (OHF-Voice/piper1-gpl)	GPL	per-voice	⚠️ Yes, but GPL obligations	⚠️ same voices	The maintained fork moved code to GPL. Run it as a separate service to avoid copyleft contaminating your app.
MeloTTS	MIT	MIT	✅ Yes	❌ No Czech	EN/ES/FR/ZH/JA/KO only.
Chatterbox (Resemble AI)	MIT	MIT	✅ Yes	⚠️ Multilingual variant claims ~23 langs; verify Czech	Embeds Perth watermark by default (see below). MIT is genuine.
XTTS-v2 (Coqui)	MPL-2.0 (code)	CPML (non-commercial)	❌ No (legally murky)	✅ Strong Czech, voice cloning	Coqui shut down Jan 2024 — no one left to sell a commercial license. Avoid for commercial products.
Tortoise-TTS	Apache-2.0	Apache-2.0	✅ Yes	❌ English-focused, no real Czech	Permissive, but slow and English-centric.
Bark (Suno)	MIT	MIT	✅ Yes	⚠️ Multilingual incl. some Czech, inconsistent	Originally shipped with NC framing; later clarified to MIT. Quality/stability are the real limits, not the license.
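
One way to use the table is to encode it as data and filter it against your product's constraints. The sketch below is illustrative only: the `TtsModel` type, the `shortlist` helper and the three-value Czech rating ("yes", "unverified", "no") are simplifications introduced here, and a caveated "yes" in the Commercial OK? column is treated as `True`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TtsModel:
    name: str
    code_license: str
    commercial_ok: bool  # "Commercial OK?" column; a caveated yes counts as True
    czech: str           # "yes", "unverified" or "no", simplified from the table


# Rows transcribed from the table above.
MODELS = [
    TtsModel("Kokoro-82M", "Apache-2.0", True, "no"),
    TtsModel("Piper (rhasspy/piper)", "MIT", True, "yes"),
    TtsModel("Piper (OHF-Voice/piper1-gpl)", "GPL", True, "yes"),
    TtsModel("MeloTTS", "MIT", True, "no"),
    TtsModel("Chatterbox", "MIT", True, "unverified"),
    TtsModel("XTTS-v2", "MPL-2.0", False, "yes"),
    TtsModel("Tortoise-TTS", "Apache-2.0", True, "no"),
    TtsModel("Bark", "MIT", True, "unverified"),
]

COPYLEFT_LICENSES = {"GPL", "AGPL"}


def shortlist(models, allow_copyleft=True):
    """Return names of models that allow commercial use and have Czech voices."""
    return [
        m.name
        for m in models
        if m.commercial_ok
        and m.czech == "yes"
        and (allow_copyleft or m.code_license not in COPYLEFT_LICENSES)
    ]
```

With these rows, `shortlist(MODELS)` returns both Piper entries, and `shortlist(MODELS, allow_copyleft=False)` leaves only the original MIT-licensed `rhasspy/piper`. Per-voice licenses still need checking separately, as the table notes.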

Part 3 — Cloud API Data Handling & GDPR Posture

Provider-by-provider

Provider	Default retention of input text/audio	Used to train models?	EU region / residency	GDPR posture	Notes
Google Cloud TTS	Not stored by default; data logging is opt-in only	No (Cloud DPA; not used to train without permission)	Yes — EU regions + Cloud Data Processing Addendum	Strong; DPA + EU residency	Cleanest default of the big three: text isn't retained unless you opt into the (free-tier-discount) logging program.
Microsoft Azure AI Speech	Synchronous TTS: not retained after processing; Custom Neural Voice data stays in your resource region	No customer-data training without consent	Yes — West/North Europe etc.; region = data location	Strong; DPA, broad compliance certs, customer-managed keys	Custom Neural Voice is gated (responsible-AI approval) — a plus for misuse prevention.
Amazon Polly	May process content to "provide and improve" services unless you opt out	Yes by default for service improvement — opt out via AWS Organizations AI services opt-out policy	Yes — eu-west-1 (Ireland), eu-central-1 (Frankfurt)	Strong after you set the opt-out; DPA available	The only major where you must actively opt out to stop content being used for service improvement. Set the org policy before going live.
OpenAI API (tts/gpt-4o-audio)	Up to 30 days for abuse monitoring, then deleted; ZDR available for approved use	No training on API data by default	Limited EU data-residency options (improving)	DPA available; ZDR for eligible accounts	Note ongoing litigation-driven preservation orders may affect deletion timing — verify current status.
ElevenLabs	History stored in account by default; Zero Retention Mode for enterprise	No training on customer content for paid/enterprise by default; free tier differs	EU data residency for enterprise	DPA + GDPR program	Best-in-class quality, but EU residency + zero-retention are enterprise-tier features, not free-tier.
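
For the Amazon Polly row, the opt-out is expressed as an AWS Organizations policy document. The sketch below builds that document as JSON. It follows the AI services opt-out policy syntax as I understand it (a `services` map, a `default` entry covering all AI services, and an `@@assign` operator set to `optOut`); verify the exact syntax against the current AWS documentation before relying on it. The function name is hypothetical.

```python
import json


def build_ai_opt_out_policy() -> str:
    """Return an AI services opt-out policy document as a JSON string.

    The "default" key applies the setting to every covered AI service;
    "optOut" asks AWS not to use your content for service improvement.
    """
    policy = {
        "services": {
            "default": {
                "opt_out_policy": {"@@assign": "optOut"}
            }
        }
    }
    return json.dumps(policy, indent=2)
```

The resulting JSON is what you would attach at the organization root as a policy of type `AISERVICES_OPT_OUT_POLICY`, through the Organizations console or API, before sending production text to the service.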
