Speech Gen is Sorceress’s text-to-speech and voice cloning tool for creating narration, character dialogue, trailers, tutorials, devlogs, and other spoken audio for games. Type a script, choose a preset or cloned voice, adjust delivery settings, then generate an MP3 speech clip you can preview, organize, rename, download, or delete.
What it does
Speech Gen converts written text into spoken narration. It includes:
- Preset male and female voices
- Optional voice cloning from an uploaded or recorded sample
- HD and Turbo generation modes
- Speed, pitch, and emotion controls
- Support for spoken interjections such as (laughs), (sighs), (coughs), (gasps), and similar cues
- A saved narration gallery with playback, waveform previews, download, renaming, deletion, search, bulk selection, and project organization
Signed-out visitors can view demo narrations. Signing in is required to generate new speech, clone voices, and save your own results.
Typical workflow
- Open Speech Gen.
- Sign in.
- Enter your narration or dialogue in the Text box.
- Choose a Voice.
- Choose HD or Turbo.
- Adjust Speed, Pitch, and Emotion if needed.
- Click Generate Speech.
- Preview the result from the narration gallery.
- Rename, download, delete, or move the narration into a project.
Speech generation can run while you continue working, so you can submit another line or variation without waiting for the previous one to finish.
Generate speech
The generation controls are in the right sidebar.
Step-by-step
- In Text, enter the line, paragraph, or script you want spoken.
- Watch the character counter to stay within the limit.
- Select a Voice:
  - Your completed cloned voices appear under My Cloned Voices when available.
  - Built-in voices are grouped under Male Voices and Female Voices.
- Select a Model:
  - HD for higher-definition audio quality.
  - Turbo for faster generation.
- In Settings, optionally adjust:
  - Speed
  - Pitch
  - Emotion
- Review the estimate shown under the settings.
- Click Generate Speech.
After you click Generate Speech, a new narration card appears in the gallery. While the clip is being created, the card displays “Generating speech...”. When complete, the card updates with a waveform, duration, playback button, download button, and delete button.
If you are signed out, the generate button is replaced by Sign in to generate.
Text limits and estimated length
The text box accepts up to 10,000 characters per generation.
Speech Gen shows:
- Current character count
- An estimate for the generation
- Approximate narration length
As a rough estimate, the tool treats about 667 characters as 1 minute of narration. Actual timing varies with punctuation, voice, speed setting, emotion, and delivery style.
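The rule of thumb above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the tool's own code: the function names are hypothetical, and the estimate ignores the Speed setting and everything else that affects real timing.

```python
CHARS_PER_MINUTE = 667   # rough rate stated in this guide
MAX_CHARS = 10_000       # per-generation text limit stated in this guide


def estimate_narration_seconds(text: str) -> float:
    """Return a rough narration length in seconds for the given script."""
    if len(text) > MAX_CHARS:
        raise ValueError(
            f"Script has {len(text)} characters; the limit is {MAX_CHARS}."
        )
    return len(text) / CHARS_PER_MINUTE * 60


def format_duration(seconds: float) -> str:
    """Format a duration as M:SS, rounded to the nearest second."""
    total = round(seconds)
    return f"{total // 60}:{total % 60:02d}"
```

By this rule, a script that fills the 10,000-character limit comes out at roughly 15 minutes of narration.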
Voices
Built-in voices
Speech Gen includes preset voices grouped by gender.
Male voices:
- Deep Voice Man
- Casual Guy
- Patient Man
- Young Knight
- Determined Man
- Decent Boy
- Imposing Manner
- Elegant Man
- Friendly Person
Female voices:
- Wise Woman
- Calm Woman
- Inspirational Girl
- Lively Girl
- Lovely Girl
- Abbess
- Sweet Girl
- Exuberant Girl
Cloned voices
Cloned voices appear in the left sidebar under My Cloned Voices and, once ready, in the voice selector under My Cloned Voices. A cloned voice can be reused for future speech generations until you delete it.
In the My Cloned Voices panel, you can:
- Expand or collapse the list
- Select a completed cloned voice for generation
- Preview a cloned voice when a preview is available
- Rename a cloned voice
- Copy the voice ID for your own reference
- Delete a cloned voice
Voices that are still being created show “Cloning in progress...”. Failed clones show “Clone failed” and can be deleted.
If you delete the cloned voice currently selected for generation, Speech Gen automatically switches back to the first built-in preset voice.
Clone a voice
Voice cloning creates a reusable custom voice from an audio sample. Use only voices you have the right to use, and choose a clean sample with clear speech.
Voice sample requirements
Voice samples can be uploaded as:
- MP3
- M4A
- WAV
The clone panel also displays these requirements and behaviors:
- Recommended sample length: at least 30 seconds for best results
- Accepted sample range shown in the tool: 10 seconds to 5 minutes
- Recordings or uploads longer than the limit are automatically trimmed to 4:59
- Files are automatically converted to MP3 when needed
- Maximum processed file size: 20 MB
- Clear audio with minimal background noise is best
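If you prepare samples outside the tool, a quick pre-check against the requirements above can save a failed upload. The sketch below is a hypothetical helper, not part of Speech Gen. It assumes you already know the sample's duration and size, and it treats 1 MB as 1,048,576 bytes. The 20 MB limit applies to the processed file, so the size check here is only a hint.

```python
from pathlib import Path

ACCEPTED_EXTENSIONS = {".mp3", ".m4a", ".wav"}
MIN_SECONDS = 10                  # lower end of the range shown in the tool
RECOMMENDED_SECONDS = 30          # recommended minimum for best results
MAX_SECONDS = 4 * 60 + 59         # longer samples are trimmed to 4:59
MAX_BYTES = 20 * 1024 * 1024      # assumption: 1 MB = 1,048,576 bytes


def check_voice_sample(filename: str, duration_s: float, size_bytes: int) -> list[str]:
    """Return a list of notes about a sample; an empty list means no issues found."""
    notes = []
    if Path(filename).suffix.lower() not in ACCEPTED_EXTENSIONS:
        notes.append("unsupported format: use MP3, M4A, or WAV")
    if duration_s < MIN_SECONDS:
        notes.append("too short: the tool shows a 10-second minimum")
    elif duration_s < RECOMMENDED_SECONDS:
        notes.append("below the recommended 30 seconds")
    if duration_s > MAX_SECONDS:
        notes.append("will be trimmed to 4:59")
    if size_bytes > MAX_BYTES:
        notes.append("larger than 20 MB: may be rejected after processing")
    return notes
```

For example, `check_voice_sample("take_01.wav", 45, 5_000_000)` returns an empty list.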
Clone from an uploaded file
- In the left sidebar, expand Clone a Voice.
- Enter a name in the “Name this voice...” field. If you focus the field while it is empty, Speech Gen suggests a simple name such as “My Voice 1.”
- Click Upload voice sample.
- Choose an MP3, M4A, or WAV file.
- Wait while the tool analyzes the audio. If needed, it may convert the file to MP3 or trim it to the maximum duration.
- Optional: enable Noise reduction.
- Optional: enable Volume normalization.
- Click Clone Voice.
- The new voice appears in My Cloned Voices while it is being created.
- When complete, it becomes selectable for speech generation.
If the uploaded file name is available and you have not already typed a voice name, Speech Gen uses the file name as a starting voice name, with underscores and hyphens replaced by spaces.
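To preview what a file name might turn into, here is a minimal sketch of the replacement rule described above. The function name is hypothetical, and dropping the file extension is an assumption; this guide only documents the underscore and hyphen replacement.

```python
from pathlib import Path


def suggest_voice_name(filename: str) -> str:
    """Turn an uploaded file name into a starting voice name."""
    stem = Path(filename).stem  # assumption: the extension is not part of the name
    return stem.replace("_", " ").replace("-", " ").strip()
```

With this sketch, `suggest_voice_name("old_wizard-take2.wav")` gives `"old wizard take2"`.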
Record a voice sample
Speech Gen includes a full-screen voice recording studio with a teleprompter script.
- In Clone a Voice, enter a voice name or leave the name empty and let Speech Gen suggest one.
- Click Record your voice.
- The Voice Recording Studio opens.
- Click Start Recording.
- Read the on-screen script aloud at a natural pace.
- Speak clearly and consistently.
- Click Stop Recording when finished.
- Recording automatically stops at 4:59.
- Wait while the recording is converted.
- Optional: enable Noise reduction and/or Volume normalization.
- Click Clone Voice.
The teleprompter script is intentionally long and varied. You do not have to read the entire script, but longer clean samples generally produce better cloned voices than very short clips.
Recording controls
In the recording studio:
- The top bar shows the recording timer while recording.
- The timer displays progress up to 4:59.
- Start Recording begins microphone capture.
- Stop Recording ends the recording.
- The close button exits the recording studio. If you close while recording, the recording is stopped.
- A note at the bottom recommends aiming for at least 30 seconds and says longer is better up to the tool limit.
If your browser blocks microphone access, allow microphone permission for the site and try again.
Generation settings
Model
Speech Gen offers two generation modes:
| Model | Best for | In-tool description |
|---|---|---|
| HD | Final narration, important dialogue, trailers, polished voiceover | Higher-quality / high-definition audio quality |
| Turbo | Drafts, rapid iteration, quick variations | Faster generation |
Speed
Controls how quickly the line is delivered.
- Range: 0.5x to 2.0x
- Default: 1.0x
- Lower values produce slower speech
- Higher values produce faster speech
Pitch
Controls the voice pitch.
- Range: -12 to +12
- Default: 0
- Negative values lower the pitch
- Positive values raise the pitch
Emotion
Controls the intended emotional delivery.
Options:
- Neutral
- Happy
- Calm
- Sad
- Angry
- Fearful
- Disgusted
- Surprised
Emotion can help shape a performance, but results vary depending on the selected voice and the text. For best results, use punctuation and wording that support the emotion you choose.
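If you keep a dialogue sheet with per-line settings, it can help to check values against the documented ranges before entering them. The sketch below is a hypothetical record type, not a Speech Gen API. The ranges and the Speed and Pitch defaults come from this section; using Neutral as the default emotion is an assumption.

```python
from dataclasses import dataclass

EMOTIONS = (
    "Neutral", "Happy", "Calm", "Sad",
    "Angry", "Fearful", "Disgusted", "Surprised",
)


@dataclass
class SpeechSettings:
    speed: float = 1.0        # documented range: 0.5x to 2.0x
    pitch: int = 0            # documented range: -12 to +12
    emotion: str = "Neutral"  # assumption: the guide does not state a default

    def __post_init__(self) -> None:
        # Reject values outside the ranges listed in this section.
        if not 0.5 <= self.speed <= 2.0:
            raise ValueError(f"speed {self.speed} is outside 0.5 to 2.0")
        if not -12 <= self.pitch <= 12:
            raise ValueError(f"pitch {self.pitch} is outside -12 to +12")
        if self.emotion not in EMOTIONS:
            raise ValueError(f"unknown emotion: {self.emotion!r}")
```

For example, `SpeechSettings(speed=0.9, pitch=-2, emotion="Calm")` is accepted, while `SpeechSettings(speed=2.5)` raises a `ValueError`.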
Interjections
The text box notes support for parenthetical interjections, including:
(laughs), (sighs), (coughs), (gasps)
Use interjections sparingly where a performance cue is needed. Too many cues can make a line sound unnatural.
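For longer scripts, a small script can list the cues you have used so you can check that they are not overdone. This is a sketch under one assumption: a cue is a single lowercase word in parentheses. The names are hypothetical, and cues outside the four documented ones are flagged for a manual check only, since the tool may support others.

```python
import re

DOCUMENTED_CUES = {"laughs", "sighs", "coughs", "gasps"}

# Assumption: a cue is one lowercase word wrapped in parentheses.
CUE_PATTERN = re.compile(r"\(([a-z]+)\)")


def find_cues(script: str) -> list[str]:
    """Return every parenthetical cue in the script, in order of appearance."""
    return CUE_PATTERN.findall(script)


def undocumented_cues(script: str) -> list[str]:
    """Return cues that are not among the four listed in this guide."""
    return [cue for cue in find_cues(script) if cue not in DOCUMENTED_CUES]
```

Ordinary parenthetical asides that contain spaces, such as "(maybe later)", are not treated as cues by this pattern.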
Work with generated narrations
Completed narrations appear as cards in the center gallery. Each card can show:
- Narration name
- Voice name
- Model used
- Text preview
- Waveform display
- Duration
- Playback controls
- Download control
- Delete control
Narration cards are automatically named from the beginning of the generated text. You can rename them after generation.
Play audio
Click Play on a completed narration card to listen. While it is playing, the button changes to Stop.
You can also click a point on the waveform to start playback from that position. The waveform highlights playback progress and shows a duration below it once the waveform has loaded.
Download audio
Click the download button on a narration card to save the speech clip as an MP3 file.
In selection mode, you can select multiple completed narrations and click Download to download the selected clips.
Rename a narration
- Click the narration title on its card.
- Type the new name.
- Press Enter or click away to save.
- Press Escape to cancel while editing.
Delete a narration
Click the trash button on a narration card to remove it. Failed generations include a Remove button on the failed-state card.
To delete multiple completed narrations:
- Click Select above the gallery.
- Click the narrations you want to delete.
- Click Delete.
- Confirm the deletion.
Deletion cannot be undone.
Organize with projects
Speech Gen includes project tabs for organizing narration clips by game, character, scene, chapter, or workflow stage.
Project tabs
- All Narrations shows every narration.
- Custom project tabs show narrations assigned to that project.
Selecting a project also clears the current search query, making it easier to focus on that project’s contents.
Create a project
Click the folder-plus button beside the project tabs. A new project is created with a default numbered name and selected automatically.
Rename a project
- Select the project tab.
- Click the pencil icon on the active tab.
- Type the new name.
- Press Enter or click away to save.
- Press Escape to cancel while editing.
Move narrations between projects
Completed narration cards can be dragged onto project tabs.
- Drag a completed narration card onto a project tab to assign it to that project.
- Drag it onto All Narrations to remove it from a project.
Only completed narration cards are draggable.
Delete a project
- Select the project tab.
- Click the trash icon on the active tab.
- Choose one of the options:
  - Keep narrations (move to All): deletes the project but keeps its narrations.
  - Delete narrations too: deletes both the project and the narrations inside it.
- Confirm or cancel.
Search and selection
When you have narrations in the gallery, Speech Gen shows a search bar above the cards.
Search filters narrations by:
- Narration name
- Text content
- Voice name
Click the X in the search field to clear the search.
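The search behavior can be pictured as a filter over the three fields listed above. The sketch below is an illustration only: the `Narration` type and `search` function are hypothetical, and case-insensitive substring matching is an assumption about how the search bar works.

```python
from dataclasses import dataclass


@dataclass
class Narration:
    name: str
    text: str
    voice: str


def search(narrations: list[Narration], query: str) -> list[Narration]:
    """Keep narrations whose name, text, or voice name contains the query."""
    q = query.strip().lower()
    if not q:
        return list(narrations)  # an empty query shows everything
    return [
        n for n in narrations
        if q in n.name.lower() or q in n.text.lower() or q in n.voice.lower()
    ]
```

Under these assumptions, a query such as "knight" would match both a clip generated with the Young Knight voice and a clip whose text mentions a knight.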
Click Select to enter selection mode. Selection mode works with completed narrations and lets you:
- See how many narrations are selected
- Download selected narrations
- Delete selected narrations
Click Select again to exit selection mode and clear the current selection.
Tips for better speech
- Write punctuation intentionally. Commas, periods, line breaks, and sentence length can affect pacing.
- Generate long scripts in sections if you want easier review, replacement, and organization.
- Use HD for final narration, trailers, important dialogue, or voiceover.
- Use Turbo when you need quick drafts or variations.
- Try neutral emotion first, then regenerate with a specific emotion if the line needs stronger delivery.
- Use interjections only where they add value.
- For character dialogue, generate several short takes rather than one very long passage.
- For cloned voices, record in a quiet room with minimal echo.
- Keep a consistent distance from the microphone when recording a clone sample.
- Avoid background music, overlapping speakers, heavy reverb, wind, or room noise in clone samples.
- A clean one-to-five-minute sample usually works better than a noisy or inconsistent sample.
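For the tip about generating long scripts in sections, the following sketch splits a script at sentence boundaries so that each section stays under a chosen size. It is a hypothetical helper, not a Speech Gen feature, and the 2,000-character default is an arbitrary choice well below the 10,000-character limit.

```python
import re


def split_script(script: str, max_chars: int = 2000) -> list[str]:
    """Split a script into sections of whole sentences, each up to max_chars."""
    # Split after ., !, or ? followed by whitespace. Line breaks between
    # sentences are replaced by single spaces when sections are rebuilt.
    sentences = re.split(r"(?<=[.!?])\s+", script.strip())
    sections: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            if current:
                sections.append(current)
            # A single sentence longer than max_chars becomes its own section.
            current = sentence
    if current:
        sections.append(current)
    return sections
```

Each returned section can then be pasted into the Text box as a separate generation, which keeps individual takes easy to review and replace.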
Troubleshooting
I see “Sign In Required” or only a sign-in button
You must be signed in to generate speech, clone voices, and save your own results. Sign in, then return to Speech Gen.
My generation is still processing
Speech generation can take a little time. The narration card remains in the gallery while it is being created. You can continue submitting other lines while waiting.
A generation failed
The failed card shows an error message when available. Remove the failed card and try again. If your script is very long, try splitting it into smaller sections.
My uploaded voice file is rejected
Use MP3, M4A, or WAV. If the processed file is too large, shorten the sample or export it at a lower bitrate and try again.
The tool says the file is too large after processing
The processed voice sample must be no larger than 20 MB. Use a shorter sample, remove silence, or export at a lower quality setting before uploading again.
Audio processing failed
Try a different source file or manually convert the audio to MP3 before uploading. If the original file is small enough, the tool may still accept it even if automatic processing fails.
Microphone recording does not start
Allow microphone permission in your browser, then try again. If permission was previously blocked, update the site permission in your browser settings and reload Speech Gen.
My recording did not convert
Try recording again, or use the upload option instead. If the issue continues, record with another app, export as MP3, M4A, or WAV, and upload the file.
My cloned voice sounds poor
Clone again with a cleaner sample. Use a quiet room, speak naturally, keep the microphone distance steady, avoid background noise, and aim for at least 30 seconds of clear speech. Longer samples up to the tool limit often produce better results.
Voice preview is unavailable
A cloned voice preview may become unavailable. If the voice is still listed as completed, it may still be usable for speech generation even when preview playback fails.
I cannot drag a narration into a project
Only completed narrations are draggable. Wait for the generation to complete, then drag the card onto the desired project tab.
FAQ
Do I need to sign in?
Yes. Signing in is required to generate speech, clone voices, and save your results. Signed-out visitors may see demo content.
What format are downloads?
Generated speech downloads as MP3.
Can I reuse a cloned voice?
Yes. Once a voice clone succeeds, it remains available under My Cloned Voices and in the voice selector until you delete it.
Can I generate multiple clips at once?
You can submit another generation while a previous one is still being created. Each generation appears as its own card in the gallery.
Can I organize voice lines by game or character?
Yes. Create project tabs, rename them, and drag completed narrations into the appropriate project.
Can I record directly in the browser?
Yes. Use Record your voice in the Clone a Voice panel to open the recording studio and read from the teleprompter.
Can I use interjections in the script?
Yes. The text box supports parenthetical cues such as (laughs), (sighs), (coughs), and (gasps). Results depend on the selected voice and phrasing.