Voice Activity Detection (VAD) Settings
These settings are accessed from the Advanced Settings → Developer Settings tab.
Voice Activity Detection (VAD)
This block allows fine-tuning of how Rapport detects when a user starts and stops speaking.
- VAD activation threshold (ms) – How long the user must be speaking before the system recognises it as speech. 
- VAD deactivation threshold (ms) – How long a pause must last before the system decides speech has ended. 
- VAD buffer length (s) – Maximum duration of speech in a single transcription block. Also affects Push-to-Talk behaviour. The maximum setting is 3 seconds.
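To make the interaction between the three settings concrete, here is a minimal sketch of a threshold-based speech segmenter. It is an illustration only: Rapport's internal VAD algorithm is not documented here, and the function name `segment_speech`, the per-frame voiced/unvoiced input, the 100 ms frame size and the default values are all assumptions made for the example.

```python
from typing import List, Tuple


def segment_speech(
    frames: List[bool],          # hypothetical input: True = voice detected in this frame
    frame_ms: int = 100,         # assumed frame size, not a Rapport setting
    activation_ms: int = 200,    # stands in for "VAD activation threshold (ms)"
    deactivation_ms: int = 500,  # stands in for "VAD deactivation threshold (ms)"
    buffer_s: float = 3.0,       # stands in for "VAD buffer length (s)"
) -> List[Tuple[int, int]]:
    """Return (start_ms, end_ms) transcription blocks for a frame sequence."""
    blocks: List[Tuple[int, int]] = []
    in_speech = False
    voiced_run = 0  # ms of uninterrupted voice while waiting for speech to start
    silent_run = 0  # ms of uninterrupted silence while inside speech
    start = 0
    max_ms = int(buffer_s * 1000)

    for i, voiced in enumerate(frames):
        t_end = (i + 1) * frame_ms
        if not in_speech:
            voiced_run = voiced_run + frame_ms if voiced else 0
            if voiced_run >= activation_ms:
                # Voice has lasted long enough to count as speech.
                in_speech = True
                start = t_end - voiced_run
                silent_run = 0
        else:
            silent_run = 0 if voiced else silent_run + frame_ms
            if silent_run >= deactivation_ms:
                # The pause is long enough: close the block at the last voiced frame.
                end = t_end - silent_run
                if end > start:
                    blocks.append((start, end))
                in_speech = False
                voiced_run = 0
            elif t_end - start >= max_ms:
                # The block reached the buffer length: emit it and keep listening.
                blocks.append((start, t_end))
                start = t_end
                silent_run = 0

    if in_speech:
        end = len(frames) * frame_ms - silent_run
        if end > start:
            blocks.append((start, end))
    return blocks
```

In this sketch, a 100 ms burst of sound never reaches the activation threshold and is dropped, while four seconds of continuous speech is split into a 3-second block followed by a 1-second block because of the buffer length.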
Experimenting with VAD Settings
The optimal thresholds depend on your use case. Experimentation is encouraged, and the fields are validated to prevent invalid inputs.
Example Behaviour
| Setting | Low Value Example | High Value Example | 
|---|---|---|
| Activation Threshold (ms) | 200ms → Short phrases like “Hi” are recognised and transcribed quickly | 2000ms → Short phrases ignored; only longer utterances (e.g. 2s+) are transcribed | 
| Deactivation Threshold (ms) | 200ms → Each pause causes text to be transcribed in smaller chunks | 2000ms → Longer pauses required; whole speech appears as one block, with more delay | 
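The behaviour in the table can be reproduced with a small simulation. The sketch below is a simplified model, not Rapport's implementation: it assumes speech is already available as a list of voiced intervals in milliseconds, and the names `transcription_blocks` and `utterance` are hypothetical.

```python
from typing import List, Optional, Tuple

Interval = Tuple[int, int]  # (start_ms, end_ms) of continuous voice


def transcription_blocks(
    voiced: List[Interval],
    activation_ms: int,
    deactivation_ms: int,
) -> List[Interval]:
    """Group voiced intervals into transcription blocks using the two thresholds."""
    blocks: List[Interval] = []
    current: Optional[Interval] = None
    for start, end in voiced:
        if current is not None and start - current[1] < deactivation_ms:
            # Pause shorter than the deactivation threshold: same block continues.
            current = (current[0], end)
            continue
        if current is not None:
            # Pause was long enough: the previous block is complete.
            blocks.append(current)
            current = None
        if end - start >= activation_ms:
            # Voice lasted long enough to be treated as speech.
            current = (start, end)
    if current is not None:
        blocks.append(current)
    return blocks


# Hypothetical recording: "Hi" (0.4 s), then a sentence spoken in two parts.
utterance = [(0, 400), (1000, 2500), (3000, 5500)]

low_both = transcription_blocks(utterance, activation_ms=200, deactivation_ms=200)
high_activation = transcription_blocks(utterance, activation_ms=2000, deactivation_ms=200)
high_deactivation = transcription_blocks(utterance, activation_ms=200, deactivation_ms=2000)
```

With both thresholds at 200 ms, the three intervals come out as three separate blocks. Raising the activation threshold to 2000 ms keeps only the 2.5-second interval, and raising the deactivation threshold to 2000 ms merges everything into a single block from 0 to 5500 ms.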
✅ Tip: Start with default values, then adjust incrementally to balance responsiveness (low thresholds) and completeness of transcription (high thresholds).