Chapter 10: Browser & OS Built-in TTS — The Free Fallback

Chapter 11 of 17 · ~16 min read

Overview

Before you reach for a model, an API key, or a Docker container, your reader's device almost certainly already ships with a text-to-speech engine that costs nothing, requires no hosting, and, whenever a locally installed voice is used, sends no text over the network. The browser exposes it through the Web Speech API (SpeechSynthesis), and every desktop and mobile OS exposes its own native layer: macOS say and AVSpeechSynthesizer, Windows SAPI5 and OneCore, Android TextToSpeech, and iOS AVSpeechSynthesizer.

This chapter is the honest accounting of that "free fallback": where it shines, where it embarrasses you, and — critically for this report — how well it handles English and Czech. The headline tension is that the engine is free and local, but you do not control which voices exist on your reader's machine. That single fact governs everything below. We cover the JavaScript API in depth (with copy-pastable code that selects a Czech voice, handles the asynchronous voiceschanged quirk, and chunks long text), the per-OS quality reality, and when to drop this in versus when it will make your product look broken. We close with the server-side cousins of the same idea — macOS say and Piper — for when you want OS-grade TTS but on your machine, not the user's.

Content

What "built-in TTS" actually is

There are two distinct surfaces, and they're easy to conflate:

The Web Speech API (window.speechSynthesis): a browser JavaScript interface. It is almost always a thin shim over the host OS's TTS engine (or, in Chrome and Edge, sometimes over a Google or Microsoft cloud voice). You call speechSynthesis.speak(utterance) and audio comes out of the user's speakers. You never receive an audio buffer, and the standard API offers no way to capture the output as a file.
The OS-native TTS layer: say on macOS, AVSpeechSynthesizer on iOS and macOS, SAPI5 and OneCore on Windows, android.speech.tts.TextToSpeech on Android. These are what you would call from a native app, or from a server process you control. macOS say is special because it can write its output to a file, making it usable server-side.
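As a minimal sketch of the browser surface described above, the helper below feature-detects the API, builds an utterance, and queues it. The name `speakText` is hypothetical, and the `synth` and `Utterance` parameters exist only so the sketch can be exercised outside a browser; in a page they default to the real globals. Note that nothing audio-like is returned.

```javascript
// Minimal sketch: speak a string through the Web Speech API, if present.
// `speakText` is a hypothetical helper name. `synth` and `Utterance` default
// to the browser globals and are parameters only for testability.
function speakText(
  text,
  lang = "en-US",
  synth = globalThis.speechSynthesis,
  Utterance = globalThis.SpeechSynthesisUtterance
) {
  if (!synth || !Utterance) {
    return false; // No Web Speech API in this environment.
  }
  const utterance = new Utterance(text);
  utterance.lang = lang; // BCP 47 language tag, e.g. "cs-CZ" for Czech.
  synth.speak(utterance); // Queues playback; hands back no audio data.
  return true;
}

// In a page, call it from a user gesture such as a click handler
// (`button` is a hypothetical element reference):
// button.addEventListener("click", () => speakText("Dobrý den", "cs-CZ"));
```

The boolean return value lets calling code hide a "listen" button when the API is missing, rather than failing silently.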

| Browser | Voice source | Czech voice present? |
| --- | --- | --- |
| Chrome (desktop) | Google network voices (`localService=false`) + installed OS voices | Often yes via Google `cs-CZ` network voice; quality decent but requires network |
| Edge (desktop) | OS voices + "Online (Natural)" neural voices over the network | Yes: `Microsoft Antonín/Vlasta Online (Natural)`, genuinely good, but network + Edge-only |
| Firefox (desktop) | Only OS-installed voices | Only if the OS has a Czech voice installed |
| Safari (macOS/iOS) | Apple system voices | Yes: `Zuzana` (cs_CZ), more if user downloads "Enhanced/Premium" |
| Chrome/Firefox (Android) | Android system TTS (Google / Samsung engine) | Depends on installed engine + downloaded `cs-CZ` data |
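Because the available voices differ per browser as the table shows, selection has to happen at runtime against whatever the device reports. The following is a sketch under stated assumptions, not a definitive implementation: `pickVoice` and `loadVoices` are hypothetical helper names, the 2000 ms timeout is an arbitrary choice, and the tag normalisation assumes language tags may arrive as either `cs-CZ` or `cs_CZ`.

```javascript
// Sketch: choose a voice for a language from whatever the device offers.
// `pickVoice` and `loadVoices` are hypothetical helper names.
function pickVoice(voices, lang = "cs-CZ") {
  // Assumption: tags may use "_" or "-" as separator, so normalise both sides.
  const norm = (tag) => tag.toLowerCase().replace("_", "-");
  const wanted = norm(lang);
  const base = wanted.split("-")[0];
  const matches = voices.filter((v) => {
    const tag = norm(v.lang);
    return tag === wanted || tag.split("-")[0] === base;
  });
  if (matches.length === 0) return null; // Caller decides how to degrade.
  // Prefer an on-device voice; accept a network voice if nothing else exists.
  return matches.find((v) => v.localService) ?? matches[0];
}

// getVoices() can return an empty list until `voiceschanged` fires,
// so resolve on whichever comes first: voices, the event, or a timeout.
function loadVoices(synth = globalThis.speechSynthesis, timeoutMs = 2000) {
  return new Promise((resolve) => {
    if (!synth) return resolve([]);
    const ready = synth.getVoices();
    if (ready.length > 0) return resolve(ready);
    const finish = () => resolve(synth.getVoices());
    if (typeof synth.addEventListener === "function") {
      synth.addEventListener("voiceschanged", finish, { once: true });
    }
    setTimeout(finish, timeoutMs); // Safety net in case the event never fires.
  });
}

// Usage in a page:
// loadVoices().then((voices) => {
//   const voice = pickVoice(voices, "cs-CZ");
//   if (voice) { /* assign to utterance.voice before speaking */ }
// });
```

Returning `null` instead of silently falling back to an English voice is a deliberate choice here: Czech text read by an English voice is usually worse than no audio at all.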

| Platform | Czech voice exists? | Local quality | Best free option | Sends text to server? |
| --- | --- | --- | --- | --- |
| macOS / iOS | Yes (Zuzana, +Premium download) | Good to very good (Premium) | Zuzana Premium, local | No |
| Windows + Edge | Yes (Jakub local; Antonín/Vlasta online) | Local: poor; Online: very good | Edge "Online (Natural)" | Online voices: yes |
| Windows + Chrome/Firefox | Sometimes (registry quirk) | Poor | Limited | Chrome network voice: yes |
| Android | Usually (after download) | Decent (Google) | Google `cs-CZ` | Often yes |
| Linux | Rarely | Robotic (espeak) | Effectively none in-browser | No |
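The "Sends text to server?" column can be approximated at runtime from each voice's `localService` property. The sketch below groups the voices for one language by that flag; `splitByLocality` is a hypothetical helper name, and the result is only as trustworthy as what the browser reports for each voice.

```javascript
// Sketch: group the voices for one language by where synthesis happens.
// `splitByLocality` is a hypothetical helper name.
function splitByLocality(voices, langPrefix = "cs") {
  const prefix = langPrefix.toLowerCase();
  const result = { local: [], network: [] };
  for (const v of voices) {
    // Compare only the primary language subtag ("cs" in "cs-CZ" or "cs_CZ").
    const primary = v.lang.toLowerCase().split(/[-_]/)[0];
    if (primary !== prefix) continue;
    // localService === false means a remote service supplies the voice.
    (v.localService ? result.local : result.network).push(v.name);
  }
  return result;
}

// Usage in a page, once voices have loaded:
// const { local, network } = splitByLocality(speechSynthesis.getVoices(), "cs");
// if (local.length === 0 && network.length > 0) {
//   // Only network voices: the text being read may leave the device.
// }
```

A check like this is most useful for private or sensitive text, where you may prefer to offer no audio rather than a network voice.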

Text-to-Speech for Reading Your Texts — Natural Voices in English & Czech (2026)

Chapter 10: Browser & OS Built-in TTS — The Free Fallback

Overview

Content

What "built-in TTS" actually is

Browser support in 2026 — the API is universal, the voices are not

The `localService` flag — local vs. network, and why it matters

The `voiceschanged` async trap (you will hit this)

Selecting a Czech voice (with graceful fallback)

Speaking, with rate / pitch / volume

The long-text bug — chunk by sentence

Quality reality, per OS — and Czech specifically

When the browser fallback is "good enough"

When it is NOT good enough (the limitations, plainly)

Server-side OS TTS — the same idea, on your machine

Decision shortcut

Key Takeaways

Key References
