sag
Use sag for ElevenLabs TTS with local playback.
API key (required)
- ELEVENLABS_API_KEY (preferred)
- SAG_API_KEY (also supported by the CLI)
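A quick pre-flight check can catch a missing key before any audio is requested. The sketch below is a hypothetical helper (sag_key_source is not part of sag) and it assumes ELEVENLABS_API_KEY wins when both variables are set, which the "preferred" note suggests but does not confirm.

```shell
# Hypothetical helper: report which API key variable is available.
# Assumption: ELEVENLABS_API_KEY takes precedence because it is listed as preferred.
sag_key_source() {
  if [ -n "${ELEVENLABS_API_KEY:-}" ]; then
    echo "ELEVENLABS_API_KEY"
  elif [ -n "${SAG_API_KEY:-}" ]; then
    echo "SAG_API_KEY"
  else
    # Neither variable is set: explain on stderr and signal failure.
    echo "no API key set: export ELEVENLABS_API_KEY or SAG_API_KEY" >&2
    return 1
  fi
}
```

Calling sag_key_source before a long run prints the variable name in use, or returns a non-zero status when neither is set.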
Quick start
- sag "Hello there"
- sag speak -v "Roger" "Hello"
- sag voices
- sag prompting (model-specific tips)
Model notes
- Default: eleven_v3 (expressive)
- Stable: eleven_multilingual_v2
- Fast: eleven_flash_v2_5
Pronunciation + delivery rules
- First fix: respell (e.g. "key-note"), add hyphens, adjust casing.
- Numbers/units/URLs: --normalize auto (or --normalize off if it mangles names).
- Language bias: --lang en|de|fr|... to guide normalization.
- v3: SSML <break> not supported; use [pause], [short pause], [long pause].
- v2/v2.5: SSML <break time="1.5s" /> supported; <phoneme> not exposed in sag.
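Because pause markup differs between model families, it can help to pick it in one place when building text. The following is a minimal sketch; pause_markup is a hypothetical helper, not a sag feature, and it only knows the three model IDs listed under Model notes.

```shell
# Hypothetical helper: print pause markup suited to a given model ID.
pause_markup() {
  case "$1" in
    # v3 does not support SSML <break>, so use a bracketed tag.
    eleven_v3) printf '%s' '[pause]' ;;
    # v2 / v2.5 models accept SSML <break>.
    eleven_multilingual_v2|eleven_flash_v2_5) printf '%s' '<break time="1.5s" />' ;;
    # Unknown model IDs are rejected rather than guessed at.
    *) echo "unknown model: $1" >&2; return 1 ;;
  esac
}

model="eleven_multilingual_v2"
text="First point. $(pause_markup "$model") Second point."
echo "$text"
# prints: First point. <break time="1.5s" /> Second point.
```

The resulting text can then be passed to sag as the message argument; how the model itself is selected on the command line is left out here, since that flag is not covered in these notes.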
v3 audio tags (put at the start of a line)
- [whispers], [shouts], [sings]
- [laughs], [starts laughing], [sighs], [exhales]
- [sarcastic], [curious], [excited], [crying], [mischievously]
- Example: sag "[whispers] keep this quiet. [short pause] ok?"
Voice defaults
- ELEVENLABS_VOICE_ID or SAG_VOICE_ID
Confirm voice + speaker before long output.
Chat voice responses
When Peter asks for a "voice" reply (e.g., "crazy scientist voice", "explain in voice"), generate audio and send it:
# Generate audio file
sag -v Clawd -o /tmp/voice-reply.mp3 "Your message here"
# Then include in reply:
# MEDIA:/tmp/voice-reply.mp3
Voice character tips:
- Crazy scientist: Use [excited] tags, dramatic pauses [short pause], vary intensity
- Calm: Use [whispers] or slower pacing
- Dramatic: Use [sings] or [shouts] sparingly
Default voice for Clawd: lj2rcrvANS3gaWWnczSX (or just -v Clawd)
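The generate-then-attach steps for a chat voice reply can be wrapped in one function. This is a sketch only: voice_reply and its optional second argument are hypothetical, while the -v and -o flags and the MEDIA: line are taken from the example above.

```shell
# Hypothetical wrapper: generate the audio file, then print the MEDIA line.
# $1 = message text (may contain v3 audio tags), $2 = optional output path.
voice_reply() {
  out="${2:-/tmp/voice-reply.mp3}"
  # -v and -o are used exactly as in the example above.
  sag -v Clawd -o "$out" "$1" || return 1
  # Emit the line the chat reply should include.
  echo "MEDIA:$out"
}
```

For a "crazy scientist" reply this might be called as voice_reply "[excited] It works! [short pause] Mostly." and the printed MEDIA: line pasted into the response. If sag fails, no MEDIA: line is printed.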