February 2, 2026


RunAnywhere React Native SDK Part 3: Text-to-Speech with Piper


Natural Voice Synthesis Entirely On-Device


This is Part 3 of our RunAnywhere React Native SDK tutorial series:

  1. Chat with LLMs — Project setup and streaming text generation
  2. Speech-to-Text — Real-time transcription with Whisper
  3. Text-to-Speech (this post) — Natural voice synthesis with Piper
  4. Voice Pipeline — Full voice assistant with VAD

Text-to-speech brings your app to life. With RunAnywhere, you can synthesize natural-sounding speech using Piper—completely on-device, with no network latency, working on both iOS and Android.

Like STT, TTS has an audio format consideration: Piper outputs base64-encoded Float32 PCM that needs to be converted for playback.

Prerequisites

  • Complete Part 1 first to set up your project with the RunAnywhere SDK
  • ~65MB additional storage for the Piper voice model

Android Note: A physical ARM64 device is required. Emulators will NOT work. See Part 1's Android Setup for complete configuration instructions.

Register the TTS Voice

Add Piper to your model registration in App.tsx:

typescript
import { RunAnywhere, ModelCategory } from '@runanywhere/core'
import { ModelArtifactType } from '@runanywhere/onnx'

// Register TTS voice (Piper)
RunAnywhere.registerModel({
  id: 'vits-piper-en_US-lessac-medium',
  name: 'Piper US English',
  url: 'https://github.com/RunanywhereAI/sherpa-onnx/releases/download/runanywhere-models-v1/vits-piper-en_US-lessac-medium.tar.gz',
  framework: 'onnx',
  modality: ModelCategory.SpeechSynthesis,
  artifactType: ModelArtifactType.TarGzArchive,
  memoryRequirement: 65_000_000,
})

Important: Piper Output Format

Piper outputs audio in a specific format:

Parameter    | Value
Sample Rate  | 22,050 Hz
Channels     | 1 (mono)
Format       | 32-bit float (Float32) PCM, base64 encoded

The SDK returns base64-encoded float32 samples that need to be decoded and converted for playback.
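
If you want to sanity-check a synthesis result before wiring up playback, here is a minimal sketch of the decode step (the `decodePiperAudio` helper name is ours, not part of the SDK; the full player class below handles this end to end):

typescript
// Minimal sketch: decode Piper's base64 Float32 PCM and derive its duration.
// Assumes atob() exists in your JS runtime (recent Hermes versions provide it;
// otherwise substitute a base64 utility such as base64-js).
function decodePiperAudio(base64Audio: string, sampleRate: number) {
  const binary = atob(base64Audio)
  const bytes = new Uint8Array(binary.length)
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i)
  }
  const samples = new Float32Array(bytes.buffer) // 4 bytes per Float32 sample
  const durationSeconds = samples.length / sampleRate // at 22,050 Hz, 22,050 samples = 1 second
  return { samples, durationSeconds }
}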

Loading and Using TTS

Create src/hooks/useTTS.ts:

typescript
import { useState, useCallback } from 'react'
import { RunAnywhere } from '@runanywhere/core'

interface TTSResult {
  audio: string // base64 encoded Float32 PCM
  sampleRate: number
  numSamples: number
  duration: number
}

export function useTTS() {
  const [isLoaded, setIsLoaded] = useState(false)
  const [isLoading, setIsLoading] = useState(false)
  const [downloadProgress, setDownloadProgress] = useState(0)

  const loadModel = useCallback(async () => {
    setIsLoading(true)
    const modelId = 'vits-piper-en_US-lessac-medium'

    try {
      // Check if already downloaded
      const isDownloaded = await RunAnywhere.isModelDownloaded(modelId)

      if (!isDownloaded) {
        await RunAnywhere.downloadModel(modelId, (progress) => {
          setDownloadProgress(progress.progress)
        })
      }

      // Load TTS voice into memory
      await RunAnywhere.loadTTSVoice(modelId)
      setIsLoaded(true)
      console.log('TTS voice loaded successfully')
    } catch (e) {
      console.error('TTS load error:', e)
      throw e
    } finally {
      setIsLoading(false)
    }
  }, [])

  const synthesize = useCallback(
    async (
      text: string,
      options?: { rate?: number; pitch?: number; volume?: number }
    ): Promise<TTSResult> => {
      if (!isLoaded) throw new Error('TTS model not loaded')

      const result = await RunAnywhere.synthesize(text, {
        voice: 'default',
        rate: options?.rate ?? 1.0,
        pitch: options?.pitch ?? 1.0,
        volume: options?.volume ?? 1.0,
      })

      return result
    },
    [isLoaded]
  )

  return {
    isLoaded,
    isLoading,
    downloadProgress,
    loadModel,
    synthesize,
  }
}

Why loadTTSVoice() instead of loadModel()? The SDK uses a separate load method for each modality: loadModel() for LLMs, loadSTTModel() for speech-to-text, and loadTTSVoice() for text-to-speech. Each modality runs on its own runtime (LlamaCPP for LLMs, ONNX for STT and TTS), so models can be loaded simultaneously without conflicts.
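
As a rough sketch, loading all three side by side looks like this (the LLM and STT model IDs below are placeholders for whatever you registered in Parts 1 and 2):

typescript
// Sketch: one load call per modality; the loaded models coexist in memory.
await RunAnywhere.loadModel('your-llm-model-id')                  // LLM (LlamaCPP runtime)
await RunAnywhere.loadSTTModel('your-whisper-model-id')           // Speech-to-text (ONNX runtime)
await RunAnywhere.loadTTSVoice('vits-piper-en_US-lessac-medium')  // Text-to-speech (ONNX runtime)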

Audio Playback

For audio playback, you'll need to convert the base64 float32 data to a playable format. Install the required dependencies:

bash
npm install react-native-sound react-native-fs @react-native-community/slider
cd ios && pod install && cd ..

Why base64? React Native's JS bridge can't transfer raw binary data directly between native code and JavaScript. The SDK returns Float32 PCM audio as a base64-encoded string, which you decode on the JS side. This is a React Native-specific consideration—Swift and Kotlin SDKs return raw bytes.

Create src/services/TTSAudioPlayer.ts:

typescript
import Sound from 'react-native-sound'
import RNFS from 'react-native-fs'

// Enable playback in silence mode
Sound.setCategory('Playback')

export class TTSAudioPlayer {
  private currentSound: Sound | null = null

  async playTTSAudio(base64Audio: string, sampleRate: number): Promise<void> {
    // Decode base64 to Float32 array
    const float32Data = this.base64ToFloat32(base64Audio)

    // Convert Float32 to Int16
    const int16Data = this.float32ToInt16(float32Data)

    // Create WAV file
    const wavData = this.createWavFile(int16Data, sampleRate)

    // Save to temp file
    const tempPath = `${RNFS.TemporaryDirectoryPath}/tts_output_${Date.now()}.wav`
    await RNFS.writeFile(tempPath, wavData, 'base64')

    // Play the file
    return new Promise((resolve, reject) => {
      this.currentSound = new Sound(tempPath, '', (error) => {
        if (error) {
          reject(error)
          return
        }

        this.currentSound?.play((success) => {
          if (success) {
            resolve()
          } else {
            reject(new Error('Playback failed'))
          }

          // Cleanup: release the sound and remove the temp file
          this.currentSound?.release()
          this.currentSound = null
          RNFS.unlink(tempPath).catch(() => {})
        })
      })
    })
  }

  stop(): void {
    if (this.currentSound) {
      this.currentSound.stop()
      this.currentSound.release()
      this.currentSound = null
    }
  }

  private base64ToFloat32(base64: string): Float32Array {
    const binaryString = atob(base64)
    const bytes = new Uint8Array(binaryString.length)
    for (let i = 0; i < binaryString.length; i++) {
      bytes[i] = binaryString.charCodeAt(i)
    }
    return new Float32Array(bytes.buffer)
  }

  private float32ToInt16(float32: Float32Array): Int16Array {
    const int16 = new Int16Array(float32.length)
    for (let i = 0; i < float32.length; i++) {
      // Clamp to [-1, 1] and scale to Int16
      const clamped = Math.max(-1, Math.min(1, float32[i]))
      int16[i] = Math.round(clamped * 32767)
    }
    return int16
  }

  private createWavFile(audioData: Int16Array, sampleRate: number): string {
    const channels = 1
    const bitsPerSample = 16
    const byteRate = sampleRate * channels * (bitsPerSample / 8)
    const blockAlign = channels * (bitsPerSample / 8)
    const dataSize = audioData.length * 2 // Int16 = 2 bytes
    const fileSize = 36 + dataSize

    // Create header (44 bytes)
    const header = new ArrayBuffer(44)
    const view = new DataView(header)

    // RIFF header
    this.writeString(view, 0, 'RIFF')
    view.setUint32(4, fileSize, true)
    this.writeString(view, 8, 'WAVE')

    // fmt subchunk
    this.writeString(view, 12, 'fmt ')
    view.setUint32(16, 16, true) // Subchunk size
    view.setUint16(20, 1, true) // PCM format
    view.setUint16(22, channels, true)
    view.setUint32(24, sampleRate, true)
    view.setUint32(28, byteRate, true)
    view.setUint16(32, blockAlign, true)
    view.setUint16(34, bitsPerSample, true)

    // data subchunk
    this.writeString(view, 36, 'data')
    view.setUint32(40, dataSize, true)

    // Combine header and audio data
    const wavBuffer = new ArrayBuffer(44 + dataSize)
    const wavView = new Uint8Array(wavBuffer)
    wavView.set(new Uint8Array(header), 0)

    // Write audio data
    const audioBytes = new Uint8Array(audioData.buffer)
    wavView.set(audioBytes, 44)

    // Convert to base64
    let binary = ''
    for (let i = 0; i < wavView.length; i++) {
      binary += String.fromCharCode(wavView[i])
    }
    return btoa(binary)
  }

  private writeString(view: DataView, offset: number, str: string): void {
    for (let i = 0; i < str.length; i++) {
      view.setUint8(offset + i, str.charCodeAt(i))
    }
  }
}

Complete TTS Screen

Create src/screens/TTSScreen.tsx:

typescript
import React, { useState, useEffect } from 'react';
import {
  View,
  Text,
  TextInput,
  TouchableOpacity,
  StyleSheet,
} from 'react-native';
import Slider from '@react-native-community/slider';
import { useTTS } from '../hooks/useTTS';
import { TTSAudioPlayer } from '../services/TTSAudioPlayer';

const audioPlayer = new TTSAudioPlayer();

export function TTSScreen() {
  const [inputText, setInputText] = useState(
    'Hello! This is text-to-speech running entirely on your device.'
  );
  const [isSynthesizing, setIsSynthesizing] = useState(false);
  const [speechRate, setSpeechRate] = useState(1.0);
  const [pitch, setPitch] = useState(1.0);

  const { isLoaded, isLoading, downloadProgress, loadModel, synthesize } = useTTS();

  useEffect(() => {
    loadModel();
  }, [loadModel]);

  async function synthesizeAndPlay() {
    if (!inputText.trim() || isSynthesizing) return;

    setIsSynthesizing(true);

    try {
      const result = await synthesize(inputText, {
        rate: speechRate,
        pitch: pitch,
        volume: 1.0,
      });

      console.log(`Synthesized: ${result.duration.toFixed(2)}s`);

      await audioPlayer.playTTSAudio(result.audio, result.sampleRate);
    } catch (e) {
      console.error('TTS error:', e);
    } finally {
      setIsSynthesizing(false);
    }
  }

  if (isLoading) {
    return (
      <View style={styles.container}>
        <Text style={styles.statusText}>
          Downloading voice model... {(downloadProgress * 100).toFixed(0)}%
        </Text>
        <View style={styles.progressBar}>
          <View style={[styles.progressFill, { width: `${downloadProgress * 100}%` }]} />
        </View>
      </View>
    );
  }

  return (
    <View style={styles.container}>
      {/* Text input */}
      <TextInput
        style={styles.textInput}
        value={inputText}
        onChangeText={setInputText}
        placeholder="Enter text to speak..."
        placeholderTextColor="#666"
        multiline
        numberOfLines={4}
      />

      {/* Speed slider */}
      <View style={styles.sliderContainer}>
        <Text style={styles.sliderLabel}>Speed: {speechRate.toFixed(1)}x</Text>
        <Slider
          style={styles.slider}
          minimumValue={0.5}
          maximumValue={2.0}
          step={0.1}
          value={speechRate}
          onValueChange={setSpeechRate}
          minimumTrackTintColor="#007AFF"
          maximumTrackTintColor="#333"
          thumbTintColor="#007AFF"
        />
      </View>

      {/* Pitch slider */}
      <View style={styles.sliderContainer}>
        <Text style={styles.sliderLabel}>Pitch: {pitch.toFixed(1)}</Text>
        <Slider
          style={styles.slider}
          minimumValue={0.5}
          maximumValue={1.5}
          step={0.1}
          value={pitch}
          onValueChange={setPitch}
          minimumTrackTintColor="#007AFF"
          maximumTrackTintColor="#333"
          thumbTintColor="#007AFF"
        />
      </View>

      {/* Speak button */}
      <TouchableOpacity
        style={[styles.speakButton, (!isLoaded || isSynthesizing) && styles.disabled]}
        onPress={synthesizeAndPlay}
        disabled={!isLoaded || isSynthesizing}
      >
        <Text style={styles.speakButtonText}>
          {isSynthesizing ? 'Synthesizing...' : '🔊 Speak'}
        </Text>
      </TouchableOpacity>
    </View>
  );
}

const styles = StyleSheet.create({
  container: {
    flex: 1,
    backgroundColor: '#000',
    padding: 24,
  },
  statusText: {
    color: '#fff',
    fontSize: 16,
    marginBottom: 16,
    textAlign: 'center',
  },
  progressBar: {
    width: '100%',
    height: 8,
    backgroundColor: '#333',
    borderRadius: 4,
    overflow: 'hidden',
  },
  progressFill: {
    height: '100%',
    backgroundColor: '#007AFF',
  },
  textInput: {
    backgroundColor: '#111',
    borderRadius: 12,
    padding: 16,
    color: '#fff',
    fontSize: 16,
    minHeight: 120,
    textAlignVertical: 'top',
  },
  sliderContainer: {
    marginTop: 24,
  },
  sliderLabel: {
    color: '#fff',
    fontSize: 14,
    marginBottom: 8,
  },
  slider: {
    width: '100%',
    height: 40,
  },
  speakButton: {
    backgroundColor: '#007AFF',
    borderRadius: 12,
    padding: 16,
    marginTop: 32,
    alignItems: 'center',
  },
  speakButtonText: {
    color: '#fff',
    fontSize: 18,
    fontWeight: '600',
  },
  disabled: {
    opacity: 0.5,
  },
});
Text-to-speech synthesis and playback controls

Memory Management

When you're done with TTS, unload the voice to free memory:

typescript
// Unload TTS voice to free memory
await RunAnywhere.unloadTTSVoice()

TTS voices can be loaded independently alongside the LLM and STT models—they don't conflict.
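
If the voice is only needed while a particular screen is visible, one option (a sketch, assuming you load the voice from that screen via the useTTS hook above) is to unload it in the effect cleanup:

typescript
// Sketch: unload the Piper voice when the screen unmounts so its ~65MB is reclaimed.
useEffect(() => {
  loadModel()
  return () => {
    RunAnywhere.unloadTTSVoice().catch(() => {})
  }
}, [loadModel])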

Models Reference

Model ID                       | Size  | Notes
vits-piper-en_US-lessac-medium | ~65MB | Natural US English

What's Next

In Part 4, we'll combine everything into a complete voice assistant with automatic Voice Activity Detection.


Resources


Questions? Open an issue on GitHub or reach out on Twitter/X.
