February 6, 2026


RunAnywhere Flutter SDK Part 3: Text-to-Speech with Piper

DEVELOPERS

Natural Voice Synthesis Entirely On-Device


This is Part 3 of our RunAnywhere Flutter SDK tutorial series:

  1. Chat with LLMs — Project setup and streaming text generation
  2. Speech-to-Text — Real-time transcription with Whisper
  3. Text-to-Speech (this post) — Natural voice synthesis with Piper
  4. Voice Pipeline — Full voice assistant with VAD

Text-to-speech brings your app to life. With RunAnywhere, you can synthesize natural-sounding speech using Piper—completely on-device, with no network latency, working identically on iOS and Android.

Like STT, TTS has an audio format consideration: Piper outputs raw Float32 PCM samples that need to be converted for playback.

Prerequisites

  • Complete Part 1 first to set up your project with the RunAnywhere SDK
  • ~65MB additional storage for the Piper voice model

Dependencies

Add the audio playback package to your pubspec.yaml, along with path_provider (the playback service below writes synthesized audio to a temporary file):

```yaml
dependencies:
  audioplayers: ^6.0.0
  path_provider: ^2.1.0
```

Then run:

```bash
flutter pub get
```

Register the TTS Voice

Add Piper to your model registration in your initialization code:

```dart
// Register TTS voice (Piper)
RunAnywhere.registerModel(
  id: 'vits-piper-en_US-lessac-medium',
  name: 'Piper US English',
  url: 'https://github.com/RunanywhereAI/sherpa-onnx/releases/download/runanywhere-models-v1/vits-piper-en_US-lessac-medium.tar.gz',
  framework: InferenceFramework.onnx,
  modality: ModelCategory.speechSynthesis,
  artifactType: ArtifactType.tarGzArchive,
  memoryRequirement: 65000000,
);
```

Important: Piper Output Format

Piper outputs audio in a specific format:

| Parameter | Value |
| --- | --- |
| Sample rate | 22,050 Hz |
| Channels | 1 (mono) |
| Format | 32-bit float (Float32) PCM |

Most audio players can't play raw Float32 PCM directly—you need to convert to a playable format or use a specialized player.
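The conversion itself is plain arithmetic: clamp each sample to [-1.0, 1.0], then scale by 32767 to fit a signed 16-bit integer. Here is that step in isolation as a small, language-agnostic Python sketch (the Dart service later in this post does the same thing with `ByteData`):

```python
import struct

def float32_to_int16(samples):
    """Clamp each sample to [-1, 1], scale to the Int16 range,
    and pack as little-endian 16-bit PCM bytes."""
    ints = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    return struct.pack('<%dh' % len(ints), *ints)

# 1.7 is out of range and gets clamped to 1.0 before scaling
pcm = float32_to_int16([0.0, 0.5, 1.0, -1.0, 1.7])
# Each sample becomes 2 bytes: 0, 16383, 32767, -32767, 32767
```

Without the clamp, an out-of-range sample like 1.7 would scale past the Int16 maximum (32767) and wrap around, producing a loud click in the output.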

Loading and Using TTS

```dart
// Download the voice (one-time, ~65MB)
final isDownloaded = await RunAnywhere.isModelDownloaded('vits-piper-en_US-lessac-medium');

if (!isDownloaded) {
  await for (final progress in RunAnywhere.downloadModel('vits-piper-en_US-lessac-medium')) {
    debugPrint('Download: ${(progress.progress * 100).toStringAsFixed(1)}%');
    if (progress.stage == DownloadStage.completed) break;
  }
}

// Load TTS voice into memory
await RunAnywhere.loadTTSVoice('vits-piper-en_US-lessac-medium');

// Synthesize speech
final result = await RunAnywhere.synthesize(
  'Hello, world!',
  rate: 1.0,
  pitch: 1.0,
  volume: 1.0,
);

// result.samples is a Float32List at 22,050 Hz
// result.sampleRate is 22050
// result.duration is the audio length in seconds
```
API Pattern: Like loadSTTModel(), the SDK uses loadTTSVoice() for speech synthesis models. LLM, STT, and TTS each have dedicated load/unload methods because they use different runtimes and memory pools. You can have all three loaded simultaneously.
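As a rough sizing guide (simple arithmetic, not an SDK API): at Piper's fixed 22,050 Hz rate, one second of output is 22,050 Float32 samples, and the Int16 conversion used for playback halves the buffer size:

```python
SAMPLE_RATE = 22050   # Piper's fixed output rate
FLOAT32_BYTES = 4     # raw synthesis output
INT16_BYTES = 2       # after conversion for playback

def buffer_sizes(seconds):
    """Return (raw Float32 bytes, converted Int16 bytes) for a clip length."""
    n = int(seconds * SAMPLE_RATE)
    return n * FLOAT32_BYTES, n * INT16_BYTES

raw, converted = buffer_sizes(1.0)
print(raw, converted)  # 88200 44100 — about 86 KB raw, 43 KB as Int16 PCM
```

So even a long utterance stays small: a 30-second clip is under 3 MB of raw samples, which is why an in-memory convert-then-play approach works fine here.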

Audio Playback Service

Create lib/services/audio_playback_service.dart:

```dart
import 'dart:io';
import 'dart:typed_data';

import 'package:audioplayers/audioplayers.dart';
import 'package:flutter/foundation.dart';
import 'package:path_provider/path_provider.dart';

class AudioPlaybackService {
  final AudioPlayer _player = AudioPlayer();

  /// Convert Float32 samples to a WAV file and play it.
  Future<void> playFloat32Audio(Float32List samples, int sampleRate) async {
    // Convert Float32 to Int16
    final int16Data = _convertFloat32ToInt16(samples);

    // Create WAV file
    final wavData = _createWavFile(int16Data, sampleRate);

    // Save to a temp file
    final directory = await getTemporaryDirectory();
    final wavPath =
        '${directory.path}/tts_output_${DateTime.now().millisecondsSinceEpoch}.wav';
    final file = File(wavPath);
    await file.writeAsBytes(wavData);

    debugPrint('Audio saved to: $wavPath (${wavData.length} bytes)');

    // Play the WAV file
    await _player.play(DeviceFileSource(wavPath));

    // Wait for playback to complete
    await _player.onPlayerComplete.first;

    // Clean up the temp file
    try {
      await file.delete();
    } catch (_) {}
  }

  /// Stop current playback.
  Future<void> stop() async {
    await _player.stop();
  }

  /// Convert Float32 samples to little-endian Int16 PCM bytes.
  Uint8List _convertFloat32ToInt16(Float32List samples) {
    final int16Bytes = ByteData(samples.length * 2);

    for (int i = 0; i < samples.length; i++) {
      // Clamp to the [-1, 1] range, then scale to the Int16 range
      final clamped = samples[i].clamp(-1.0, 1.0);
      final int16Value = (clamped * 32767).toInt();
      int16Bytes.setInt16(i * 2, int16Value, Endian.little);
    }

    return int16Bytes.buffer.asUint8List();
  }

  /// Create a WAV file (44-byte header + data) from Int16 audio bytes.
  Uint8List _createWavFile(Uint8List audioData, int sampleRate) {
    const channels = 1;
    const bitsPerSample = 16;
    final byteRate = sampleRate * channels * (bitsPerSample ~/ 8);
    final blockAlign = channels * (bitsPerSample ~/ 8);
    final dataSize = audioData.length;
    final fileSize = 36 + dataSize;

    final header = ByteData(44);
    int offset = 0;

    // RIFF header
    header.setUint8(offset++, 0x52); // R
    header.setUint8(offset++, 0x49); // I
    header.setUint8(offset++, 0x46); // F
    header.setUint8(offset++, 0x46); // F
    header.setUint32(offset, fileSize, Endian.little);
    offset += 4;
    header.setUint8(offset++, 0x57); // W
    header.setUint8(offset++, 0x41); // A
    header.setUint8(offset++, 0x56); // V
    header.setUint8(offset++, 0x45); // E

    // fmt subchunk
    header.setUint8(offset++, 0x66); // f
    header.setUint8(offset++, 0x6D); // m
    header.setUint8(offset++, 0x74); // t
    header.setUint8(offset++, 0x20); // space
    header.setUint32(offset, 16, Endian.little); // Subchunk size
    offset += 4;
    header.setUint16(offset, 1, Endian.little); // PCM format
    offset += 2;
    header.setUint16(offset, channels, Endian.little);
    offset += 2;
    header.setUint32(offset, sampleRate, Endian.little);
    offset += 4;
    header.setUint32(offset, byteRate, Endian.little);
    offset += 4;
    header.setUint16(offset, blockAlign, Endian.little);
    offset += 2;
    header.setUint16(offset, bitsPerSample, Endian.little);
    offset += 2;

    // data subchunk
    header.setUint8(offset++, 0x64); // d
    header.setUint8(offset++, 0x61); // a
    header.setUint8(offset++, 0x74); // t
    header.setUint8(offset++, 0x61); // a
    header.setUint32(offset, dataSize, Endian.little);

    // Combine header and audio data
    final result = Uint8List(44 + audioData.length);
    result.setRange(0, 44, header.buffer.asUint8List());
    result.setRange(44, 44 + audioData.length, audioData);

    return result;
  }

  /// Dispose of resources.
  void dispose() {
    _player.dispose();
  }
}
```

Important: The Float32-to-Int16 conversion is essential for standard audio players. The clamping step ensures no overflow during scaling.
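If you want to sanity-check the header math outside of Dart, the same 44-byte layout can be assembled with Python's `struct` module and validated with the standard-library `wave` reader (the field order and values mirror the Dart code above; nothing here is RunAnywhere-specific):

```python
import io
import struct
import wave

def wav_header(data_size, sample_rate, channels=1, bits=16):
    """Build a 44-byte PCM WAV header: RIFF chunk, fmt subchunk, data subchunk."""
    byte_rate = sample_rate * channels * bits // 8
    block_align = channels * bits // 8
    return struct.pack(
        '<4sI4s4sIHHIIHH4sI',
        b'RIFF', 36 + data_size, b'WAVE',
        b'fmt ', 16, 1, channels, sample_rate, byte_rate, block_align, bits,
        b'data', data_size,
    )

# One second of silence at Piper's 22,050 Hz output rate
data = b'\x00\x00' * 22050
wav_bytes = wav_header(len(data), 22050) + data

with wave.open(io.BytesIO(wav_bytes)) as w:
    print(w.getframerate(), w.getnchannels(), w.getsampwidth(), w.getnframes())
    # 22050 1 2 22050
```

If the `wave` module accepts the bytes and reports the expected rate, channel count, and sample width, the header fields are laid out correctly.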

Complete TTS View

Create lib/features/tts/text_to_speech_view.dart:

```dart
import 'package:flutter/material.dart';
import 'package:runanywhere/runanywhere.dart';

import '../../services/audio_playback_service.dart';

class TextToSpeechView extends StatefulWidget {
  const TextToSpeechView({super.key});

  @override
  State<TextToSpeechView> createState() => _TextToSpeechViewState();
}

class _TextToSpeechViewState extends State<TextToSpeechView> {
  final TextEditingController _controller = TextEditingController(
    text: 'Hello! This is text-to-speech running entirely on your device.',
  );
  final AudioPlaybackService _audioService = AudioPlaybackService();

  bool _isSynthesizing = false;
  bool _isModelLoaded = false;
  double _downloadProgress = 0.0;
  double _speechRate = 1.0;
  double _pitch = 1.0;

  @override
  void initState() {
    super.initState();
    _loadModel();
  }

  @override
  void dispose() {
    _controller.dispose();
    _audioService.dispose();
    super.dispose();
  }

  Future<void> _loadModel() async {
    const modelId = 'vits-piper-en_US-lessac-medium';

    final isDownloaded = await RunAnywhere.isModelDownloaded(modelId);

    if (!isDownloaded) {
      await for (final progress in RunAnywhere.downloadModel(modelId)) {
        if (!mounted) return;
        setState(() {
          _downloadProgress = progress.progress;
        });
        if (progress.stage == DownloadStage.completed) break;
      }
    }

    await RunAnywhere.loadTTSVoice(modelId);
    if (!mounted) return;
    setState(() {
      _isModelLoaded = true;
    });
  }

  Future<void> _synthesizeAndPlay() async {
    final text = _controller.text.trim();
    if (text.isEmpty || _isSynthesizing) return;

    setState(() {
      _isSynthesizing = true;
    });

    try {
      final result = await RunAnywhere.synthesize(
        text,
        rate: _speechRate,
        pitch: _pitch,
        volume: 1.0,
      );

      debugPrint(
          'Synthesized: ${result.duration.toStringAsFixed(2)}s, ${result.sampleRate}Hz');

      // Play the audio
      await _audioService.playFloat32Audio(result.samples, result.sampleRate);
    } catch (e) {
      // Guard against using the context after the widget is gone
      if (mounted) {
        ScaffoldMessenger.of(context).showSnackBar(
          SnackBar(content: Text('TTS Error: $e')),
        );
      }
    } finally {
      if (mounted) {
        setState(() {
          _isSynthesizing = false;
        });
      }
    }
  }

  @override
  Widget build(BuildContext context) {
    return Scaffold(
      appBar: AppBar(
        title: const Text('Text to Speech'),
      ),
      body: Padding(
        padding: const EdgeInsets.all(24),
        child: Column(
          crossAxisAlignment: CrossAxisAlignment.start,
          children: [
            // Status indicator while the voice downloads
            if (!_isModelLoaded)
              Column(
                children: [
                  const Text('Downloading voice model...'),
                  const SizedBox(height: 8),
                  LinearProgressIndicator(value: _downloadProgress),
                  const SizedBox(height: 24),
                ],
              ),

            // Text input
            TextField(
              controller: _controller,
              maxLines: 4,
              decoration: const InputDecoration(
                labelText: 'Text to speak',
                border: OutlineInputBorder(),
              ),
            ),

            const SizedBox(height: 24),

            // Speed slider (0.5x to 2.0x in 0.1 steps)
            Text('Speed: ${_speechRate.toStringAsFixed(1)}x'),
            Slider(
              value: _speechRate,
              min: 0.5,
              max: 2.0,
              divisions: 15,
              onChanged: (value) => setState(() => _speechRate = value),
            ),

            const SizedBox(height: 16),

            // Pitch slider (0.5 to 1.5 in 0.1 steps)
            Text('Pitch: ${_pitch.toStringAsFixed(1)}'),
            Slider(
              value: _pitch,
              min: 0.5,
              max: 1.5,
              divisions: 10,
              onChanged: (value) => setState(() => _pitch = value),
            ),

            const SizedBox(height: 32),

            // Speak button
            SizedBox(
              width: double.infinity,
              child: ElevatedButton.icon(
                onPressed: _isModelLoaded && !_isSynthesizing
                    ? _synthesizeAndPlay
                    : null,
                icon: Icon(
                    _isSynthesizing ? Icons.hourglass_empty : Icons.volume_up),
                label: Text(_isSynthesizing ? 'Synthesizing...' : 'Speak'),
                style: ElevatedButton.styleFrom(
                  padding: const EdgeInsets.all(16),
                ),
              ),
            ),
          ],
        ),
      ),
    );
  }
}
```
Text-to-speech synthesis and playback controls

Memory Management

When you're done with TTS, unload the voice to free memory:

```dart
// Unload the TTS voice to free its memory
await RunAnywhere.unloadTTSVoice();
```

TTS voices can be loaded independently alongside the LLM and STT models—they don't conflict.

Models Reference

| Model ID | Size | Notes |
| --- | --- | --- |
| vits-piper-en_US-lessac-medium | ~65MB | Natural US English |

What's Next

In Part 4, we'll combine everything into a complete voice assistant with automatic Voice Activity Detection.



Questions? Open an issue on GitHub or reach out on Twitter/X.


Copyright © 2025 RunAnywhere, Inc.