February 3, 2026

RunAnywhere React Native SDK Part 4: Building a Voice Assistant with VAD

DEVELOPERS

A Complete Voice Assistant Running Entirely On-Device


This is Part 4 of our RunAnywhere React Native SDK tutorial series:

  1. Chat with LLMs — Project setup and streaming text generation
  2. Speech-to-Text — Real-time transcription with Whisper
  3. Text-to-Speech — Natural voice synthesis with Piper
  4. Voice Pipeline (this post) — Full voice assistant with VAD

This is the culmination of the series: a voice assistant that automatically detects when you stop speaking, processes your request with an LLM, and responds with synthesized speech—all running on-device across iOS and Android.

Prerequisites

  • Complete Parts 1-3 to have all three model types (LLM, STT, TTS) working in your project
  • Physical device required — the pipeline uses microphone input (see the permission check after this list)
  • All three models downloaded (~390MB total: 250MB LLM + 75MB STT + 65MB TTS)

Android Note: A physical ARM64 device is required. Emulators will NOT work. See Part 1's Android Setup for complete configuration instructions.
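
Since the pipeline records from the microphone, confirm the app can actually use it before starting. iOS prompts automatically as long as NSMicrophoneUsageDescription is set in your Info.plist (as configured in Part 2); Android additionally requires a runtime permission request. A minimal check, sketched with React Native's built-in PermissionsAndroid API:

typescript
// Sketch: ensure microphone access before starting the pipeline.
// iOS prompts automatically (needs NSMicrophoneUsageDescription in
// Info.plist); Android requires an explicit runtime request.
import { PermissionsAndroid, Platform } from 'react-native'

export async function ensureMicPermission(): Promise<boolean> {
  if (Platform.OS !== 'android') return true
  const result = await PermissionsAndroid.request(
    PermissionsAndroid.PERMISSIONS.RECORD_AUDIO,
  )
  return result === PermissionsAndroid.RESULTS.GRANTED
}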

The Voice Pipeline Flow

text
┌────────────────────────────────────────────────────────────────┐
│                    Voice Assistant Pipeline                    │
├────────────────────────────────────────────────────────────────┤
│                                                                │
│   ┌─────────┐    ┌─────────┐    ┌─────────┐    ┌─────────┐     │
│   │ Record  │ -> │   STT   │ -> │   LLM   │ -> │   TTS   │     │
│   │  + VAD  │    │ Whisper │    │  LFM2   │    │  Piper  │     │
│   └─────────┘    └─────────┘    └─────────┘    └─────────┘     │
│        │                                            │          │
│        └────── Auto-stop when silence detected ─────┘          │
│                                                                │
└────────────────────────────────────────────────────────────────┘

Pipeline State Machine

Create src/hooks/useVoicePipeline.ts:

typescript
import { useState, useCallback, useRef } from 'react'
import { RunAnywhere } from '@runanywhere/core'
import { AudioService } from '../services/AudioService'
import { TTSAudioPlayer } from '../services/TTSAudioPlayer'

// --- Energy-Based Voice Activity Detector ---
// Monitors audio input levels to detect speech start and end.

const SPEECH_THRESHOLD = 0.02 // Level to detect speech start
const SILENCE_THRESHOLD = 0.01 // Level to detect speech end
const SILENCE_DURATION_MS = 1500 // Milliseconds of silence before auto-stop

class VoiceActivityDetector {
  private isSpeechDetected = false
  private silenceStartTime: number | null = null
  private vadInterval: ReturnType<typeof setInterval> | null = null

  onSpeechEnded: (() => void) | null = null

  startMonitoring() {
    this.isSpeechDetected = false
    this.silenceStartTime = null

    this.vadInterval = setInterval(() => {
      const level = AudioService.getInputLevel()

      // Detect speech start
      if (!this.isSpeechDetected && level > SPEECH_THRESHOLD) {
        this.isSpeechDetected = true
        this.silenceStartTime = null
        console.log('[VAD] Speech detected')
      }

      // Detect speech end (only after speech was detected)
      if (this.isSpeechDetected) {
        if (level < SILENCE_THRESHOLD) {
          if (this.silenceStartTime === null) {
            this.silenceStartTime = Date.now()
          } else if (Date.now() - this.silenceStartTime >= SILENCE_DURATION_MS) {
            console.log('[VAD] Auto-stopping after silence')
            this.stopMonitoring()
            this.onSpeechEnded?.()
          }
        } else {
          this.silenceStartTime = null // Speech resumed
        }
      }
    }, 100) // Check every 100ms
  }

  stopMonitoring() {
    if (this.vadInterval) {
      clearInterval(this.vadInterval)
      this.vadInterval = null
    }
  }
}

// --- Pipeline Hook ---

export type PipelineState = 'idle' | 'listening' | 'transcribing' | 'thinking' | 'speaking'

export function useVoicePipeline() {
  const [state, setState] = useState<PipelineState>('idle')
  const [transcribedText, setTranscribedText] = useState('')
  const [responseText, setResponseText] = useState('')
  const [error, setError] = useState<string | null>(null)

  const audioPlayerRef = useRef(new TTSAudioPlayer())
  const vadRef = useRef(new VoiceActivityDetector())

  const isReady = useCallback(async (): Promise<boolean> => {
    const isLLMLoaded = await RunAnywhere.isModelLoaded()
    const isSTTLoaded = await RunAnywhere.isSTTModelLoaded()
    const isTTSLoaded = await RunAnywhere.isTTSVoiceLoaded()
    return isLLMLoaded && isSTTLoaded && isTTSLoaded
  }, [])

  const processRecording = useCallback(async () => {
    // 1. Stop recording
    setState('transcribing')

    try {
      const audioData = await AudioService.stopRecording()

      // 2. Transcribe
      const userText = await RunAnywhere.transcribe(audioData)
      setTranscribedText(userText)

      if (!userText.trim()) {
        setState('idle')
        return
      }

      // 3. Generate LLM response
      setState('thinking')

      const prompt = `You are a helpful voice assistant. Keep responses SHORT (2-3 sentences max).
Be conversational and friendly.

User: ${userText}
Assistant:`

      const streamResult = await RunAnywhere.generateStream(prompt, {
        maxTokens: 100,
        temperature: 0.7,
      })

      let response = ''
      for await (const token of streamResult.stream) {
        response += token
        setResponseText(response)
      }

      // 4. Speak the response
      setState('speaking')

      const ttsResult = await RunAnywhere.synthesize(response, {
        rate: 1.0,
        pitch: 1.0,
        volume: 1.0,
      })

      await audioPlayerRef.current.playTTSAudio(ttsResult.audio, ttsResult.sampleRate)
    } catch (e) {
      console.error('Pipeline error:', e)
      setError(e instanceof Error ? e.message : 'Unknown error')
    }

    setState('idle')
  }, [])

  const start = useCallback(async () => {
    if (state !== 'idle') return

    const ready = await isReady()
    if (!ready) {
      setError('Models not loaded. Please load LLM, STT, and TTS first.')
      return
    }

    setState('listening')
    setTranscribedText('')
    setResponseText('')
    setError(null)

    try {
      await AudioService.initialize()
      AudioService.startRecording()

      // Start energy-based VAD monitoring
      vadRef.current.onSpeechEnded = () => {
        processRecording()
      }
      vadRef.current.startMonitoring()
    } catch (e) {
      setError(e instanceof Error ? e.message : 'Failed to start')
      setState('idle')
    }
  }, [state, isReady, processRecording])

  const stopManually = useCallback(async () => {
    vadRef.current.stopMonitoring()
    await processRecording()
  }, [processRecording])

  const cancel = useCallback(() => {
    vadRef.current.stopMonitoring()
    audioPlayerRef.current.stop()
    setState('idle')
  }, [])

  return {
    state,
    transcribedText,
    responseText,
    error,
    start,
    stopManually,
    cancel,
    isReady,
  }
}

AudioService.getInputLevel(): You need to add a getInputLevel() static method to the AudioService from Part 2. This returns the current RMS audio amplitude (0.0 to 1.0) so the VAD can monitor input levels:

typescript
// Add to AudioService from Part 2
static getInputLevel(): number {
  // Calculate RMS from the current recording buffer
  if (!this.currentBuffer || this.currentBuffer.length === 0) return 0
  const samples = this.currentBuffer
  let sum = 0
  for (let i = 0; i < samples.length; i++) {
    sum += samples[i] * samples[i]
  }
  return Math.sqrt(sum / samples.length)
}
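
This assumes AudioService keeps a currentBuffer of the most recent samples. If your Part 2 implementation delivers audio through a frame callback instead, one way to maintain that buffer is sketched below; the callback name and sample format are assumptions about your setup:

typescript
// Sketch: keep only the latest audio frame around so getInputLevel()
// reflects current input. Assumes frames arrive as Float32Array via
// whatever callback your Part 2 AudioService uses.
private static currentBuffer: Float32Array | null = null

// Call this from the recording callback for each incoming frame.
static onAudioFrame(frame: Float32Array) {
  // A single frame is enough for an instantaneous RMS reading.
  this.currentBuffer = frame
}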

Voice Pipeline Screen

Create src/screens/VoiceAssistantScreen.tsx:

typescript
import React, { useEffect, useState } from 'react';
import {
  View,
  Text,
  TouchableOpacity,
  StyleSheet,
} from 'react-native';
import { useVoicePipeline } from '../hooks/useVoicePipeline';

export function VoiceAssistantScreen() {
  const {
    state,
    transcribedText,
    responseText,
    error,
    start,
    stopManually,
    isReady,
  } = useVoicePipeline();

  const [modelsReady, setModelsReady] = useState(false);

  useEffect(() => {
    isReady().then(setModelsReady);
  }, [isReady]);

  function getStateColor(): string {
    switch (state) {
      case 'idle': return '#666';
      case 'listening': return '#ff4444';
      case 'transcribing':
      case 'thinking': return '#ffaa00';
      case 'speaking': return '#44ff44';
      default: return '#666';
    }
  }

  function getStateText(): string {
    switch (state) {
      case 'idle': return 'Ready';
      case 'listening': return 'Listening...';
      case 'transcribing': return 'Transcribing...';
      case 'thinking': return 'Thinking...';
      case 'speaking': return 'Speaking...';
      default: return 'Ready';
    }
  }

  function getStateHint(): string {
    switch (state) {
      case 'idle': return 'Tap to start';
      case 'listening': return 'Stops automatically when you pause';
      case 'transcribing': return 'Converting speech to text...';
      case 'thinking': return 'Generating response...';
      case 'speaking': return 'Playing audio response...';
      default: return '';
    }
  }

  function handleButtonPress() {
    if (state === 'idle') {
      start();
    } else if (state === 'listening') {
      stopManually();
    }
  }

  return (
    <View style={styles.container}>
      {/* State indicator */}
      <View style={styles.stateIndicator}>
        <View style={[styles.stateDot, { backgroundColor: getStateColor() }]} />
        <Text style={styles.stateText}>{getStateText()}</Text>
      </View>

      {/* Error message */}
      {error && (
        <View style={styles.errorBox}>
          <Text style={styles.errorText}>{error}</Text>
        </View>
      )}

      {/* Transcription */}
      {transcribedText !== '' && (
        <View style={[styles.bubble, styles.userBubble]}>
          <Text style={styles.bubbleLabel}>You said:</Text>
          <Text style={styles.bubbleText}>{transcribedText}</Text>
        </View>
      )}

      {/* Response */}
      {responseText !== '' && (
        <View style={[styles.bubble, styles.assistantBubble]}>
          <Text style={styles.bubbleLabel}>Assistant:</Text>
          <Text style={styles.bubbleText}>{responseText}</Text>
        </View>
      )}

      <View style={styles.spacer} />

      {/* Main button */}
      <TouchableOpacity
        style={[
          styles.mainButton,
          state === 'idle' ? styles.buttonIdle : styles.buttonActive,
        ]}
        onPress={handleButtonPress}
        disabled={!modelsReady || (state !== 'idle' && state !== 'listening')}
      >
        <Text style={styles.buttonIcon}>
          {state === 'idle' ? '🎤' : '⬛'}
        </Text>
      </TouchableOpacity>

      <Text style={styles.hintText}>{getStateHint()}</Text>

      {!modelsReady && (
        <Text style={styles.warningText}>
          Please load LLM, STT, and TTS models first
        </Text>
      )}
    </View>
  );
}

const styles = StyleSheet.create({
  container: {
    flex: 1,
    backgroundColor: '#000',
    padding: 24,
    alignItems: 'center',
  },
  stateIndicator: {
    flexDirection: 'row',
    alignItems: 'center',
    marginBottom: 24,
  },
  stateDot: {
    width: 12,
    height: 12,
    borderRadius: 6,
    marginRight: 8,
  },
  stateText: {
    color: '#fff',
    fontSize: 18,
    fontWeight: '500',
  },
  errorBox: {
    backgroundColor: 'rgba(255, 68, 68, 0.1)',
    borderRadius: 8,
    padding: 12,
    marginBottom: 16,
    width: '100%',
  },
  errorText: {
    color: '#ff4444',
    textAlign: 'center',
  },
  bubble: {
    width: '100%',
    padding: 16,
    borderRadius: 12,
    marginBottom: 16,
  },
  userBubble: {
    backgroundColor: 'rgba(0, 122, 255, 0.1)',
  },
  assistantBubble: {
    backgroundColor: 'rgba(68, 255, 68, 0.1)',
  },
  bubbleLabel: {
    color: '#888',
    fontSize: 12,
    marginBottom: 4,
  },
  bubbleText: {
    color: '#fff',
    fontSize: 16,
  },
  spacer: {
    flex: 1,
  },
  mainButton: {
    width: 100,
    height: 100,
    borderRadius: 50,
    justifyContent: 'center',
    alignItems: 'center',
  },
  buttonIdle: {
    backgroundColor: '#007AFF',
  },
  buttonActive: {
    backgroundColor: '#ff4444',
  },
  buttonIcon: {
    fontSize: 36,
  },
  hintText: {
    color: '#666',
    fontSize: 12,
    marginTop: 16,
  },
  warningText: {
    color: '#ffaa00',
    fontSize: 12,
    marginTop: 8,
  },
});
Voice assistant pipeline in action
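
To try the screen out, register it in the navigator you set up in Part 1. A sketch assuming React Navigation bottom tabs; adjust to your actual navigation setup:

typescript
// Sketch: wiring the screen into the tab navigator from Part 1
// (assumes React Navigation bottom tabs).
import React from 'react';
import { createBottomTabNavigator } from '@react-navigation/bottom-tabs';
import { VoiceAssistantScreen } from './src/screens/VoiceAssistantScreen';

const Tab = createBottomTabNavigator();

export function AppTabs() {
  return (
    <Tab.Navigator>
      {/* ...existing Chat / STT / TTS tabs from Parts 1-3... */}
      <Tab.Screen name="Voice" component={VoiceAssistantScreen} />
    </Tab.Navigator>
  );
}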

Best Practices

1. Preload Models on App Start

typescript
// In App.tsx or a dedicated initialization screen
async function preloadModels() {
  await downloadAndLoadLLM('lfm2-350m-q4_k_m')
  await downloadAndLoadSTT('sherpa-onnx-whisper-tiny.en')
  await downloadAndLoadTTS('vits-piper-en_US-lessac-medium')
}
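
Calling this once on mount keeps the voice screen instant instead of stalling on first use. A minimal sketch; LoadingScreen and RootNavigator are hypothetical placeholders for your own components:

typescript
// Sketch: gate the UI on model preload at startup.
// LoadingScreen / RootNavigator are hypothetical placeholders.
import React, { useEffect, useState } from 'react'

export default function App() {
  const [ready, setReady] = useState(false)

  useEffect(() => {
    preloadModels()
      .then(() => setReady(true))
      .catch((e) => console.error('Model preload failed:', e))
  }, [])

  return ready ? <RootNavigator /> : <LoadingScreen />
}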

2. Audio Format Summary

| Component        | Sample Rate | Format           | Channels |
| ---------------- | ----------- | ---------------- | -------- |
| Recording        | 16,000 Hz   | Int16            | 1        |
| Whisper STT      | 16,000 Hz   | Int16            | 1        |
| Piper TTS Output | 22,050 Hz   | Float32 (base64) | 1        |
| Audio Playback   | Any         | WAV/Int16        | 1-2      |
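
The one mismatch worth calling out: Piper emits Float32 samples while the playback path expects Int16 PCM. The TTSAudioPlayer from Part 3 takes care of playback for you, but if you ever need the conversion yourself, it is only a few lines (a generic sketch, not SDK code):

typescript
// Convert Float32 samples (range -1.0..1.0) to Int16 PCM.
function float32ToInt16(samples: Float32Array): Int16Array {
  const out = new Int16Array(samples.length)
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i])) // clamp out-of-range values
    out[i] = s < 0 ? s * 0x8000 : s * 0x7fff
  }
  return out
}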

3. Check Model State

typescript
async function isVoiceAgentReady(): Promise<boolean> {
  const [llm, stt, tts] = await Promise.all([
    RunAnywhere.isModelLoaded(),
    RunAnywhere.isSTTModelLoaded(),
    RunAnywhere.isTTSVoiceLoaded(),
  ])
  return llm && stt && tts
}

4. Prevent Concurrent Operations

typescript
const start = useCallback(async () => {
  if (state !== 'idle') return // Prevent double-starts
  // ...
}, [state])
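
Note that `state` captured in a `useCallback` closure can be stale if the callback was created before the latest render. If double-starts still slip through under rapid taps, a ref-based guard is more robust; a sketch:

typescript
// Sketch: a ref survives re-renders and is never stale, unlike captured state.
const busyRef = useRef(false)

const start = useCallback(async () => {
  if (busyRef.current) return // reject re-entry immediately
  busyRef.current = true
  try {
    // ...run the pipeline...
  } finally {
    busyRef.current = false // always release, even after errors
  }
}, [])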

5. Tune VAD for Your Environment

The default thresholds work for quiet environments. Adjust for noisy settings:

typescript
const SPEECH_THRESHOLD = 0.05 // Higher for noisy environments
const SILENCE_THRESHOLD = 0.02 // Higher for noisy environments
const SILENCE_DURATION_MS = 2000 // Longer pause tolerance
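
Instead of hardcoding values, you can also derive thresholds from a quick measurement of the ambient noise floor. A sketch, meant to run while the microphone is already open but before prompting the user to speak; the multipliers are rough starting points to tune:

typescript
// Sketch: sample the input level for ~1s and set thresholds relative
// to the measured noise floor.
async function calibrateThresholds(): Promise<{ speech: number; silence: number }> {
  const levels: number[] = []
  for (let i = 0; i < 20; i++) {
    levels.push(AudioService.getInputLevel())
    await new Promise((resolve) => setTimeout(resolve, 50)) // sample every 50ms
  }
  const noiseFloor = levels.reduce((a, b) => a + b, 0) / levels.length
  return { speech: noiseFloor * 3, silence: noiseFloor * 1.5 }
}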

Models Reference

| Type | Model ID                       | Size   | Notes                     |
| ---- | ------------------------------ | ------ | ------------------------- |
| LLM  | lfm2-350m-q4_k_m               | ~250MB | LiquidAI, fast, efficient |
| STT  | sherpa-onnx-whisper-tiny.en    | ~75MB  | English                   |
| TTS  | vits-piper-en_US-lessac-medium | ~65MB  | US English                |

Conclusion

You've built a complete voice assistant that:

  • Listens with automatic speech detection
  • Transcribes using on-device Whisper
  • Thinks with a local LLM
  • Responds with natural TTS

All processing happens on-device. No data ever leaves the phone. No API keys. No cloud costs. And it works on both iOS and Android from a single codebase.

This is the future of private, cross-platform voice AI.


Complete Source Code

The full source code is available on GitHub:

React Native Starter App

Includes:

  • Complete React Native app with all features
  • TypeScript throughout
  • Zustand state management
  • Tab navigation


Questions? Open an issue on GitHub or reach out on Twitter/X.
