February 7, 2026


RunAnywhere Flutter SDK Part 4: Building a Voice Assistant with VAD

DEVELOPERS

A Complete Voice Assistant Running Entirely On-Device


This is Part 4 of our RunAnywhere Flutter SDK tutorial series:

  1. Chat with LLMs — Project setup and streaming text generation
  2. Speech-to-Text — Real-time transcription with Whisper
  3. Text-to-Speech — Natural voice synthesis with Piper
  4. Voice Pipeline (this post) — Full voice assistant with VAD

This is the culmination of the series: a voice assistant that automatically detects when you stop speaking, processes your request with an LLM, and responds with synthesized speech—all running on-device across iOS and Android.

The key feature is Voice Activity Detection (VAD): the assistant knows when you've finished speaking without requiring a button press.

Prerequisites

  • Complete Parts 1-3 to have all three model types (LLM, STT, TTS) working in your project
  • Physical device required — the pipeline uses microphone input
  • All three models downloaded (~390 MB total: 250 MB LLM + 75 MB STT + 65 MB TTS)

The Voice Pipeline Flow

text
┌─────────────────────────────────────────────────────────────────┐
│                    Voice Assistant Pipeline                     │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌─────────┐    ┌─────────┐    ┌─────────┐    ┌─────────┐       │
│  │ Record  │ -> │   STT   │ -> │   LLM   │ -> │   TTS   │       │
│  │  + VAD  │    │ Whisper │    │  LFM2   │    │  Piper  │       │
│  └─────────┘    └─────────┘    └─────────┘    └─────────┘       │
│       │                                            │            │
│       │  Auto-stop when                            │            │
│       └────────── silence detected ────────────────┘            │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
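Energy-based VAD, as used below, needs a single number per audio frame that tracks loudness. If your recording plugin does not expose an amplitude reading directly, you can compute one yourself as the root-mean-square (RMS) of each PCM frame. A minimal sketch, assuming 16-bit PCM input (the `rmsLevel` helper is illustrative, not part of the SDK):

```dart
import 'dart:math' as math;
import 'dart:typed_data';

/// Normalized RMS level (0.0 to 1.0) of a frame of 16-bit PCM samples.
double rmsLevel(Int16List frame) {
  if (frame.isEmpty) return 0.0;
  var sumSquares = 0.0;
  for (final sample in frame) {
    final normalized = sample / 32768.0; // Int16 range -> [-1.0, 1.0)
    sumSquares += normalized * normalized;
  }
  return math.sqrt(sumSquares / frame.length);
}

void main() {
  final silence = Int16List.fromList(List.filled(160, 0));
  final tone = Int16List.fromList(
    List.generate(160, (i) => (8000 * math.sin(2 * math.pi * i / 16)).round()),
  );
  print(rmsLevel(silence)); // 0.0
  print(rmsLevel(tone) > 0.02); // true — this frame would trip speechThreshold
}
```

Feeding each 100 ms frame through a helper like this gives you exactly the amplitude stream the pipeline's VAD logic consumes.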

Pipeline State Machine

Create lib/features/voice/voice_pipeline.dart:

dart
import 'dart:async';
import 'dart:typed_data';
import 'package:flutter/foundation.dart';
import 'package:runanywhere/runanywhere.dart';
import '../../services/audio_recording_service.dart';
import '../../services/audio_playback_service.dart';

enum PipelineState {
  idle,
  listening,
  transcribing,
  thinking,
  speaking,
}

class VoicePipeline extends ChangeNotifier {
  final AudioRecordingService _audioService = AudioRecordingService();
  final AudioPlaybackService _playbackService = AudioPlaybackService();

  PipelineState _state = PipelineState.idle;
  String _transcribedText = '';
  String _responseText = '';
  String? _errorMessage;
  Timer? _vadTimer;

  // VAD thresholds (tune these for your environment)
  static const double speechThreshold = 0.02; // Level to detect speech start
  static const double silenceThreshold = 0.01; // Level to detect speech end
  static const double silenceDuration = 1.5; // Seconds of silence before auto-stop

  // VAD state
  bool _isSpeechDetected = false;
  DateTime? _silenceStartTime;

  PipelineState get state => _state;
  String get transcribedText => _transcribedText;
  String get responseText => _responseText;
  String? get errorMessage => _errorMessage;

  Future<void> start() async {
    if (_state != PipelineState.idle) return;

    // Ensure all models are loaded
    final isReady = await _isReady();
    if (!isReady) {
      _errorMessage = 'Models not loaded. Please load LLM, STT, and TTS first.';
      notifyListeners();
      return;
    }

    _state = PipelineState.listening;
    _transcribedText = '';
    _responseText = '';
    _errorMessage = null;
    notifyListeners();

    try {
      await _audioService.startRecording();

      // Start energy-based VAD monitoring
      _startVADMonitoring();
    } catch (e) {
      _errorMessage = e.toString();
      _state = PipelineState.idle;
      notifyListeners();
    }
  }

  void _startVADMonitoring() {
    _isSpeechDetected = false;
    _silenceStartTime = null;

    _vadTimer = Timer.periodic(
      const Duration(milliseconds: 100),
      (_) => _checkAudioLevel(),
    );
  }

  void _checkAudioLevel() {
    final amplitude = _audioService.getAmplitude();

    // Detect speech start
    if (!_isSpeechDetected && amplitude > speechThreshold) {
      _isSpeechDetected = true;
      _silenceStartTime = null;
      debugPrint('Speech detected');
    }

    // Detect speech end (only after speech was detected)
    if (_isSpeechDetected) {
      if (amplitude < silenceThreshold) {
        _silenceStartTime ??= DateTime.now();

        final elapsed = DateTime.now().difference(_silenceStartTime!).inMilliseconds;
        if (elapsed >= (silenceDuration * 1000).toInt()) {
          debugPrint('Auto-stopping after silence');
          _stopVADMonitoring();
          _processRecording();
        }
      } else {
        _silenceStartTime = null; // Speech resumed; reset the silence clock
      }
    }
  }

  void _stopVADMonitoring() {
    _vadTimer?.cancel();
    _vadTimer = null;
  }

  Future<void> stopManually() async {
    _stopVADMonitoring();
    await _processRecording();
  }

  Future<void> _processRecording() async {
    if (_state != PipelineState.listening) return;

    // 1. Stop recording and get audio
    _state = PipelineState.transcribing;
    notifyListeners();

    try {
      final audioData = await _audioService.stopRecording();

      if (audioData == null || audioData.isEmpty) {
        _state = PipelineState.idle;
        notifyListeners();
        return;
      }

      // 2. Transcribe
      final text = await RunAnywhere.transcribe(audioData);
      _transcribedText = text;
      notifyListeners();

      if (text.trim().isEmpty) {
        _state = PipelineState.idle;
        notifyListeners();
        return;
      }

      // 3. Generate LLM response
      _state = PipelineState.thinking;
      notifyListeners();

      final prompt = '''
You are a helpful voice assistant. Keep responses SHORT (2-3 sentences max).
Be conversational and friendly.

User: $text
Assistant:''';

      final options = LLMGenerationOptions(
        maxTokens: 100,
        temperature: 0.7,
      );

      final streamResult = await RunAnywhere.generateStream(prompt, options: options);

      String response = '';
      await for (final token in streamResult.stream) {
        response += token;
        _responseText = response;
        notifyListeners();
      }

      // 4. Speak the response
      _state = PipelineState.speaking;
      notifyListeners();

      final ttsResult = await RunAnywhere.synthesize(
        response,
        rate: 1.0,
        pitch: 1.0,
        volume: 1.0,
      );

      await _playbackService.playFloat32Audio(
        ttsResult.samples,
        ttsResult.sampleRate,
      );

      // Wait for audio to finish (approximate). If your SDK version exposes
      // no duration field, derive it as samples.length / sampleRate.
      await Future.delayed(Duration(
        milliseconds: (ttsResult.duration * 1000).toInt() + 500,
      ));
    } catch (e) {
      debugPrint('Pipeline error: $e');
      _errorMessage = e.toString();
    }

    _state = PipelineState.idle;
    notifyListeners();
  }

  Future<bool> _isReady() async {
    return RunAnywhere.isModelLoaded &&
        RunAnywhere.isSTTModelLoaded &&
        RunAnywhere.isTTSVoiceLoaded;
  }

  @override
  void dispose() {
    _stopVADMonitoring();
    _audioService.dispose();
    super.dispose();
  }
}
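The trickiest part of `_checkAudioLevel` is the silence "hangover": silence only ends the utterance once it has persisted for the full `silenceDuration`, and any louder reading resets the clock. The same rules, extracted into a pure class with explicit time steps so they can be unit-tested without a microphone (the `VadDecider` name and `feed` method are illustrative, not SDK API):

```dart
/// Pure version of the pipeline's VAD rules: speech starts above
/// [speechThreshold]; once started, [silenceMs] of readings below
/// [silenceThreshold] ends the utterance. Time advances by the
/// caller-supplied elapsed milliseconds per reading.
class VadDecider {
  VadDecider({
    this.speechThreshold = 0.02,
    this.silenceThreshold = 0.01,
    this.silenceMs = 1500,
  });

  final double speechThreshold;
  final double silenceThreshold;
  final int silenceMs;

  bool _speaking = false;
  int _silentFor = 0;

  /// Feed one amplitude reading; returns true when the utterance is over.
  bool feed(double amplitude, int elapsedMs) {
    if (!_speaking && amplitude > speechThreshold) {
      _speaking = true;
      _silentFor = 0;
    } else if (_speaking) {
      if (amplitude < silenceThreshold) {
        _silentFor += elapsedMs;
        if (_silentFor >= silenceMs) return true;
      } else {
        _silentFor = 0; // Speech resumed; reset the silence clock
      }
    }
    return false;
  }
}

void main() {
  final vad = VadDecider();
  // 0.5 s of speech, then sustained silence in 100 ms ticks.
  for (var i = 0; i < 5; i++) {
    vad.feed(0.05, 100);
  }
  var endedAt = 0;
  for (var i = 1; i <= 20; i++) {
    if (vad.feed(0.001, 100)) {
      endedAt = i * 100;
      break;
    }
  }
  print(endedAt); // 1500 — fires exactly at the configured hangover
}
```

Keeping this logic pure also makes it easy to experiment with thresholds against recorded amplitude traces before touching the live pipeline.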

Voice Pipeline UI

Create lib/features/voice/voice_assistant_view.dart:

dart
import 'package:flutter/material.dart';
import 'package:provider/provider.dart';
import 'voice_pipeline.dart';

class VoiceAssistantView extends StatelessWidget {
  const VoiceAssistantView({super.key});

  @override
  Widget build(BuildContext context) {
    return ChangeNotifierProvider(
      create: (_) => VoicePipeline(),
      child: const _VoiceAssistantContent(),
    );
  }
}

class _VoiceAssistantContent extends StatelessWidget {
  const _VoiceAssistantContent();

  @override
  Widget build(BuildContext context) {
    final pipeline = context.watch<VoicePipeline>();

    return Scaffold(
      appBar: AppBar(
        title: const Text('Voice Assistant'),
      ),
      body: Padding(
        padding: const EdgeInsets.all(24),
        child: Column(
          children: [
            // State indicator
            _StateIndicator(state: pipeline.state),

            const SizedBox(height: 24),

            // Error message
            if (pipeline.errorMessage != null)
              Container(
                padding: const EdgeInsets.all(12),
                decoration: BoxDecoration(
                  color: Colors.red.withOpacity(0.1),
                  borderRadius: BorderRadius.circular(8),
                ),
                child: Text(
                  pipeline.errorMessage!,
                  style: const TextStyle(color: Colors.red),
                ),
              ),

            // Transcription
            if (pipeline.transcribedText.isNotEmpty)
              _ConversationBubble(
                label: 'You said:',
                text: pipeline.transcribedText,
                color: Colors.blue,
              ),

            const SizedBox(height: 16),

            // Response
            if (pipeline.responseText.isNotEmpty)
              _ConversationBubble(
                label: 'Assistant:',
                text: pipeline.responseText,
                color: Colors.green,
              ),

            const Spacer(),

            // Main button
            _MainButton(
              state: pipeline.state,
              onPressed: () {
                if (pipeline.state == PipelineState.idle) {
                  pipeline.start();
                } else if (pipeline.state == PipelineState.listening) {
                  pipeline.stopManually();
                }
              },
            ),

            const SizedBox(height: 16),

            Text(
              _getStateHint(pipeline.state),
              style: TextStyle(
                color: Colors.grey[600],
                fontSize: 12,
              ),
            ),
          ],
        ),
      ),
    );
  }

  String _getStateHint(PipelineState state) {
    switch (state) {
      case PipelineState.idle:
        return 'Tap to start';
      case PipelineState.listening:
        return 'Stops automatically when you pause';
      case PipelineState.transcribing:
        return 'Converting speech to text...';
      case PipelineState.thinking:
        return 'Generating response...';
      case PipelineState.speaking:
        return 'Playing audio response...';
    }
  }
}

class _StateIndicator extends StatelessWidget {
  final PipelineState state;

  const _StateIndicator({required this.state});

  @override
  Widget build(BuildContext context) {
    return Row(
      mainAxisAlignment: MainAxisAlignment.center,
      children: [
        Container(
          width: 12,
          height: 12,
          decoration: BoxDecoration(
            shape: BoxShape.circle,
            color: _getStateColor(),
          ),
        ),
        const SizedBox(width: 8),
        Text(
          _getStateText(),
          style: const TextStyle(
            fontSize: 16,
            fontWeight: FontWeight.w500,
          ),
        ),
      ],
    );
  }

  Color _getStateColor() {
    switch (state) {
      case PipelineState.idle:
        return Colors.grey;
      case PipelineState.listening:
        return Colors.red;
      case PipelineState.transcribing:
      case PipelineState.thinking:
        return Colors.orange;
      case PipelineState.speaking:
        return Colors.green;
    }
  }

  String _getStateText() {
    switch (state) {
      case PipelineState.idle:
        return 'Ready';
      case PipelineState.listening:
        return 'Listening...';
      case PipelineState.transcribing:
        return 'Transcribing...';
      case PipelineState.thinking:
        return 'Thinking...';
      case PipelineState.speaking:
        return 'Speaking...';
    }
  }
}

class _ConversationBubble extends StatelessWidget {
  final String label;
  final String text;
  final Color color;

  const _ConversationBubble({
    required this.label,
    required this.text,
    required this.color,
  });

  @override
  Widget build(BuildContext context) {
    return Container(
      width: double.infinity,
      padding: const EdgeInsets.all(16),
      decoration: BoxDecoration(
        color: color.withOpacity(0.1),
        borderRadius: BorderRadius.circular(12),
      ),
      child: Column(
        crossAxisAlignment: CrossAxisAlignment.start,
        children: [
          Text(
            label,
            style: TextStyle(
              color: Colors.grey[600],
              fontSize: 12,
            ),
          ),
          const SizedBox(height: 4),
          Text(
            text,
            style: const TextStyle(fontSize: 16),
          ),
        ],
      ),
    );
  }
}

class _MainButton extends StatelessWidget {
  final PipelineState state;
  final VoidCallback onPressed;

  const _MainButton({
    required this.state,
    required this.onPressed,
  });

  @override
  Widget build(BuildContext context) {
    final isActive = state == PipelineState.idle || state == PipelineState.listening;

    return GestureDetector(
      onTap: isActive ? onPressed : null,
      child: Container(
        width: 80,
        height: 80,
        decoration: BoxDecoration(
          shape: BoxShape.circle,
          color: state == PipelineState.idle ? Colors.blue : Colors.red,
        ),
        child: Icon(
          state == PipelineState.idle ? Icons.mic : Icons.stop,
          size: 36,
          color: Colors.white,
        ),
      ),
    );
  }
}

Best Practices

1. Preload Models During Onboarding

dart
// Download and load all models sequentially
await modelService.downloadAndLoadLLM();
await modelService.downloadAndLoadSTT();
await modelService.downloadAndLoadTTS();

2. Handle Memory Pressure

dart
// Unload when not needed
await RunAnywhere.unloadModel();
await RunAnywhere.unloadSTTModel();
await RunAnywhere.unloadTTSVoice();

3. Audio Format Summary

| Component        | Sample Rate | Format    | Channels |
|------------------|-------------|-----------|----------|
| Recording        | 16,000 Hz   | Int16     | 1        |
| Whisper STT      | 16,000 Hz   | Int16     | 1        |
| Piper TTS output | 22,050 Hz   | Float32   | 1        |
| Audio playback   | Any         | WAV/Int16 | 1-2      |

Always match audio formats!
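Mismatched formats are a common source of silent failures. If you ever need to bridge the two sample formats in the table above, e.g. feed Float32 synthesis output into an Int16-only player, a small conversion helper does the job (`float32ToInt16` is illustrative, not an SDK function):

```dart
import 'dart:typed_data';

/// Convert Float32 samples in [-1.0, 1.0] to Int16 PCM, clamping overshoot.
Int16List float32ToInt16(Float32List samples) {
  final out = Int16List(samples.length);
  for (var i = 0; i < samples.length; i++) {
    final clamped = samples[i].clamp(-1.0, 1.0);
    out[i] = (clamped * 32767).round();
  }
  return out;
}

void main() {
  final pcm = float32ToInt16(Float32List.fromList([0.0, 0.5, 1.0, -1.5]));
  print(pcm); // [0, 16384, 32767, -32767] — note the clamped last sample
}
```

Clamping before scaling matters: TTS output can overshoot ±1.0 slightly, and without the clamp those samples would wrap around and produce loud clicks.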

4. Prevent Concurrent Operations

dart
Future<void> start() async {
  if (_state != PipelineState.idle) return; // Prevent double-starts
  // ...
}

5. Tune VAD for Your Environment

The default thresholds work for quiet environments. Adjust for noisy settings:

dart
static const double speechThreshold = 0.05; // Higher for noisy environments
static const double silenceThreshold = 0.02; // Higher for noisy environments
static const double silenceDuration = 2.0; // Longer pause tolerance
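Beyond hand-tuning, you can estimate thresholds from a short sample of ambient amplitude readings captured before listening starts, placing the silence threshold just above the measured noise floor. A sketch of that idea (`calibrateThresholds` and its multipliers are assumptions, not SDK API):

```dart
import 'dart:math' as math;

/// Derive VAD thresholds from ambient amplitude readings: the noise floor is
/// the mean plus two standard deviations; silence and speech thresholds sit
/// at fixed multiples above it.
({double speech, double silence}) calibrateThresholds(List<double> ambient) {
  final mean = ambient.reduce((a, b) => a + b) / ambient.length;
  final variance =
      ambient.map((a) => (a - mean) * (a - mean)).reduce((a, b) => a + b) /
          ambient.length;
  final noiseFloor = mean + 2 * math.sqrt(variance);
  return (speech: noiseFloor * 3, silence: noiseFloor * 1.5);
}

void main() {
  // Quiet room: amplitude readings hover around 0.005.
  final t = calibrateThresholds([0.004, 0.005, 0.006, 0.005]);
  print(t.speech > t.silence); // true — speech threshold is always higher
}
```

Collecting half a second of readings on app start is usually enough; recalibrate whenever the user moves to a noticeably louder environment.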

6. Check Model State Before Operations

dart
bool get isVoiceAgentReady {
  return RunAnywhere.isModelLoaded &&
      RunAnywhere.isSTTModelLoaded &&
      RunAnywhere.isTTSVoiceLoaded;
}

Models Reference

| Type | Model ID                        | Size    | Notes                     |
|------|---------------------------------|---------|---------------------------|
| LLM  | lfm2-350m-q4_k_m                | ~250 MB | LiquidAI, fast, efficient |
| STT  | sherpa-onnx-whisper-tiny.en     | ~75 MB  | English                   |
| TTS  | vits-piper-en_US-lessac-medium  | ~65 MB  | US English                |

Completed Voice Assistant screen

Voice Assistant app fully set up with conversation bubbles, speaking status, and audio playback

Conclusion

You've built a complete voice assistant that:

  • Listens with automatic speech detection
  • Transcribes using on-device Whisper
  • Thinks with a local LLM
  • Responds with natural TTS

All processing happens on-device. No data ever leaves the phone. No API keys. No cloud costs. And it works identically on both iOS and Android.

This is the future of private, cross-platform AI applications.


Complete Source Code

The full source code is available on GitHub:

Flutter Starter App

Includes:

  • Complete Flutter app with all features
  • Provider-based state management
  • Platform-specific audio handling
  • Reusable components and design system

Built-in VoiceSession API

For a higher-level API, RunAnywhere also provides a built-in VoiceSession that handles the full pipeline with events:

dart
final session = await RunAnywhere.startVoiceSession(
  config: VoiceSessionConfig(
    autoDetectSilence: true,
    silenceThreshold: 1.5,
  ),
);

session.events.listen((event) {
  if (event is VoiceSessionTranscribed) {
    debugPrint('User said: ${event.text}');
  } else if (event is VoiceSessionResponded) {
    debugPrint('AI response: ${event.text}');
  }
});

This is useful when you want the SDK to manage the full STT, LLM, and TTS pipeline for you without implementing each step manually.



Questions? Open an issue on GitHub or reach out on Twitter/X.



Copyright © 2025 RunAnywhere, Inc.