Why does someone else's AI voiceover sound full of soul and feeling, while yours sounds like a robot reading a script, choppy and lifeless? The difference isn't the tool. It comes down to three things: picking the right voice, writing the right script, and using punctuation properly. We've just added 4 new lifelike voices and put together the full workflow from script to finished video, along with scriptwriting tips that bring your characters to life.
Meet the 4 New Lifelike Voices
Sora (Cantonese, Female)

Fly (Cantonese, Male)

Xiao Q (Mandarin, Female)

Zhen Dong (Mandarin, Male)

Want to preview all 17 voices? Head over to the Voice Studio and listen instantly.
From Text to Video: The Full Voiceover Workflow
Just three decisions to go from script to finished piece.
Decision 1 — Pick the Voice
A. Use Your Own Voice (Voice Cloning)
Upload a voice sample between 30 seconds and 3 minutes long, and we generate a personal voice model. Every script you write afterwards is read in your own voice. Best for personal brands, long-running YouTube channels, and podcasters. Read the Voice Cloning Complete Guide or try the Voice Clone Demo directly.
B. Pick a Professional Voice
The 4 new voices plus 13 existing ones make 17 professional voiceover voices in total, covering Cantonese, Mandarin, English, and Japanese. Pick and go — no training required.
Decision 2 — Pick the Output Format
A. Text to Speech (.mp3 audio)
Pure audio output. Bring your own visuals (vlog, b-roll, tutorial screen recording) and drop the mp3 into Premiere, Final Cut, or CapCut.
B. Text to Video (.mp4 video)
We auto-generate captions and a visual backdrop — output is a finished video ready to upload, perfect for shorts and social reels. See The Real Story of Making Ghost Tales with Text to Video.
Decision 3 — Where It Goes
- Podcast audio
- YouTube voice-over (narration / b-roll dub)
- Tutorial videos
- Ad and brand shorts
Key point: none of these use cases need lip-sync. The voiceover sits as an audio layer on top of the visuals — it's not a talking-head video, so AI voiceover is a perfect fit.
Why Not Just Record Yourself?
| Recording yourself | Hey Subtitle AI voiceover |
|---|---|
| Forgetting lines, fumbling words, hoarse throat | One take, no retakes |
| Re-record every script revision | Regenerate instantly when text changes |
| Edit out every 'um' and 'uh' | Not needed |
| Mic, quiet room, time blocked off | Just your computer |
| Fatigue after many takes | Zero emotional drain |
Scriptwriting Tips: Bringing Soul into Every Line
AI voiceover isn't plug-and-play. Each voice has its own personality and rhythm, and knowing how to write for it is what unlocks its full potential.
Punctuation is the emotion switch
A comma (,) adds a short pause, a period (.) gives breathing room, a question mark (?) lifts the ending tone, an exclamation mark (!) adds emphasis, and a tilde (~) adds a playful drag. The same sentence with different punctuation has a completely different feel.
Spacing controls rhythm
Spaces between words signal slight pauses to the voiceover engine. Add a space before and after key terms to spotlight important information.
Paragraph breaks trigger deep breaths
Split long paragraphs across several lines, and the engine automatically inserts deeper breathing pauses at each break, which makes longer passages sound more natural.
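If you draft scripts in bulk, the line-splitting tip can be roughly automated. The Python sketch below regroups a long paragraph into short lines before you paste it in. The function name `split_for_breaths`, the default of two sentences per line, and the use of a single newline as the break are all assumptions made for illustration, not part of the product, so check by ear which kind of break the engine actually responds to.

```python
import re


def split_for_breaths(paragraph: str, sentences_per_line: int = 2,
                      separator: str = "\n") -> str:
    """Regroup a paragraph into lines of a few sentences each.

    Hypothetical helper: the separator the voiceover engine reacts to
    is an assumption, so pass "\n\n" if blank lines work better.
    """
    if sentences_per_line < 1:
        raise ValueError("sentences_per_line must be at least 1")

    # Split wherever sentence-ending punctuation is followed by whitespace.
    sentences = [
        s.strip()
        for s in re.split(r"(?<=[.!?])\s+", paragraph)
        if s.strip()
    ]

    # Join the sentences back together in small groups, one group per line.
    lines = [
        " ".join(sentences[i:i + sentences_per_line])
        for i in range(0, len(sentences), sentences_per_line)
    ]
    return separator.join(lines)


if __name__ == "__main__":
    draft = (
        "We just added four new voices. Two speak Cantonese. "
        "Two speak Mandarin. Try each one with the same script!"
    )
    print(split_for_breaths(draft))
```

The splitter treats any period, question mark, or exclamation mark followed by a space as the end of a sentence, so a deliberate ellipsis such as "Let's... go" will also be split. Review the output by hand before generating audio.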
Comparison demo
- Flat: The weather is so nice today let's go hiking
- With punctuation: The weather is so nice today, let's go hiking!
- Advanced: The weather~ is so nice today! Let's... go hiking!
The free minutes are there so you can write several versions, compare them, and pick the most natural one as your final take. Each attempt teaches you more about a voice's personality.
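To make side-by-side comparison quicker, the versions can be produced from the same clauses in one pass. The Python sketch below is only an illustration: the name `punctuation_variants` and the three variant labels are hypothetical. It reproduces the flat and punctuated lines from the comparison demo, plus a simpler emphatic version. Tildes and ellipses, as in the advanced line, are a creative choice and are left for you to add by hand.

```python
def punctuation_variants(clauses: list[str]) -> dict[str, str]:
    """Build a few punctuation variants of one sentence for A/B listening.

    Hypothetical helper: the labels and rules are illustrative only.
    """
    # Drop empty clauses so the joins below stay clean.
    parts = [c.strip() for c in clauses if c.strip()]

    return {
        # No punctuation at all: the engine gets no pause or tone cues.
        "flat": " ".join(parts),
        # Commas between clauses, one exclamation mark at the end.
        "punctuated": ", ".join(parts) + "!",
        # Every clause becomes its own capitalised exclamation.
        "emphatic": " ".join(p[:1].upper() + p[1:] + "!" for p in parts),
    }


if __name__ == "__main__":
    variants = punctuation_variants(
        ["The weather is so nice today", "let's go hiking"]
    )
    for label, text in variants.items():
        print(f"{label}: {text}")
```

Paste each line into the same voice, listen to the results back to back, and keep the one that sounds most natural.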
