Higgs Audio v3 Instruct

Features

Understands speech and text — reasons over prosody, emotion, speaker, and content together instead of transcribing first.
Follows instructions — holds the instruction frame across turns, even mid-conversation.
Audio-native tool calling — uses filler speech to hide tool latency, handles async results, and cleanly cancels or ignores stale work.
Interruption-aware — tracks conversational state and resumes cleanly when the user cuts in.
Intelligent text output — supports multi-speaker routing and turn-by-turn state tracking for grounded, decision-ready replies.
Multi-turn — sustains context across long calls and multi-step workflows without losing the thread.
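
The tool-calling and interruption behavior described above can be illustrated with a generic asyncio pattern. This is a minimal sketch, not the Higgs Audio API, which is not yet published: `slow_tool`, `answer`, `demo`, the `events` log, and all timing values are hypothetical names and numbers chosen for illustration. It shows filler speech being emitted only when a tool call exceeds a latency threshold, and a stale tool call being cancelled when the user cuts in with a new request.

```python
import asyncio


async def slow_tool(query: str, delay: float) -> str:
    """Hypothetical stand-in for a real tool call that takes `delay` seconds."""
    await asyncio.sleep(delay)
    return f"result for {query!r}"


async def answer(query: str, delay: float, filler_after: float, events: list) -> None:
    """Run a tool call, speaking filler if it is slow and cleaning up if cancelled."""
    tool = asyncio.create_task(slow_tool(query, delay))
    try:
        # Wait up to `filler_after` seconds for the tool before speaking filler.
        done, _ = await asyncio.wait({tool}, timeout=filler_after)
        if not done:
            events.append("filler: One moment while I look that up.")
        result = await tool
        events.append(f"say: {result}")
    except asyncio.CancelledError:
        # The user interrupted: the pending tool call is now stale, so drop it.
        tool.cancel()
        events.append(f"cancelled: {query}")
        raise


async def demo() -> list:
    events = []
    # The first request is slow enough to trigger filler speech.
    first = asyncio.create_task(answer("weather in Paris", 0.2, 0.05, events))
    await asyncio.sleep(0.1)  # filler has been spoken; the tool is still pending
    first.cancel()  # the user cuts in, so the first request is abandoned
    try:
        await first
    except asyncio.CancelledError:
        pass
    # The second request returns before the filler threshold, so no filler is spoken.
    await answer("weather in Rome", 0.01, 0.05, events)
    return events


if __name__ == "__main__":
    for line in asyncio.run(demo()):
        print(line)
```

In this sketch the event log stands in for spoken output; a real voice agent would stream audio instead of appending strings.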

Higgs Audio v3 Instruct will be available soon via a hosted API and a Voice-Agent SDK.
