Speech Gen | Sorceress Wiki

Speech Gen is Sorceress’s text-to-speech and voice cloning tool for creating narration, character dialogue, trailers, tutorials, devlogs, and other spoken audio for games. Type a script, choose a preset or cloned voice, adjust delivery settings, then generate an MP3 speech clip you can preview, organize, rename, download, or delete.

What it does

Speech Gen converts written text into spoken narration. It includes:

Preset male and female voices
Optional voice cloning from an uploaded or recorded sample
HD and Turbo generation modes
Speed, pitch, and emotion controls
Support for spoken interjections such as (laughs), (sighs), (coughs), (gasps), and similar cues
A saved narration gallery with playback, waveform previews, download, renaming, deletion, search, bulk selection, and project organization

Signed-out visitors can view demo narrations. Signing in is required to generate new speech, clone voices, and save your own results.

Typical workflow

Open Speech Gen.
Sign in.
Enter your narration or dialogue in the Text box.
Choose a Voice.
Choose HD or Turbo.
Adjust Speed, Pitch, and Emotion if needed.
Click Generate Speech.
Preview the result from the narration gallery.
Rename, download, delete, or move the narration into a project.

Speech generation can run while you continue working, so you can submit another line or variation without waiting for the previous one to finish.

Generate speech

The generation controls are in the right sidebar.

Step-by-step

In Text, enter the line, paragraph, or script you want spoken.
Watch the character counter to stay within the limit.
Select a Voice:
- Your completed cloned voices appear under My Cloned Voices when available.
- Built-in voices are grouped under Male Voices and Female Voices.
Select a Model:
- HD for higher-definition audio quality.
- Turbo for faster generation.
In Settings, optionally adjust:
- Speed
- Pitch
- Emotion
Review the estimate shown under the settings.
Click Generate Speech.

After you click generate, a new narration card appears in the gallery. While it is being created, the card displays Generating speech.... When complete, the card updates with a waveform, duration, playback button, download button, and delete button.

If you are signed out, the generate button is replaced by Sign in to generate.

Text limits and estimated length

The text box accepts up to 10,000 characters per generation.

Speech Gen shows:

Current character count
An estimate for the generation
Approximate narration length

As a rough guide, the tool treats about 667 characters as 1 minute of narration. Actual timing varies with punctuation, voice, speed setting, emotion, and delivery style.
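The estimate above is simple arithmetic, and it can be handy to reproduce it when planning scripts outside the tool. The following is a minimal sketch using only the two figures quoted on this page (667 characters per minute and the 10,000-character limit). The function names are illustrative and are not part of Speech Gen, and the sketch ignores the Speed setting and every other factor that changes real timing.

```python
CHARS_PER_MINUTE = 667   # rough rate quoted on this page
MAX_CHARS = 10_000       # per-generation text limit quoted on this page


def estimate_seconds(text: str) -> float:
    """Return a rough narration length in seconds for the given text."""
    return len(text) / CHARS_PER_MINUTE * 60


def format_mmss(seconds: float) -> str:
    """Format a duration as M:SS, rounded to the nearest second."""
    total = round(seconds)
    return f"{total // 60}:{total % 60:02d}"


if __name__ == "__main__":
    line = "The gate is open. Move, before the guards return!"
    print(len(line), "characters,", format_mmss(estimate_seconds(line)))
    print("Full-length script:", format_mmss(estimate_seconds("x" * MAX_CHARS)))
```

By this rule of thumb, a script that uses the full 10,000 characters comes out at about 15 minutes.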

Voices

Built-in voices

Speech Gen includes preset voices grouped by gender.

Male voices:

Deep Voice Man
Casual Guy
Patient Man
Young Knight
Determined Man
Decent Boy
Imposing Manner
Elegant Man
Friendly Person

Female voices:

Wise Woman
Calm Woman
Inspirational Girl
Lively Girl
Lovely Girl
Abbess
Sweet Girl
Exuberant Girl

Cloned voices

Cloned voices appear in the left sidebar under My Cloned Voices and, once ready, in the voice selector under My Cloned Voices. A cloned voice can be reused for future speech generations until you delete it.

In the My Cloned Voices panel, you can:

Expand or collapse the list
Select a completed cloned voice for generation
Preview a cloned voice when a preview is available
Rename a cloned voice
Copy the voice ID for your own reference
Delete a cloned voice

Voices that are still being created show Cloning in progress.... Failed clones show Clone failed and can be deleted.

If you delete the cloned voice currently selected for generation, Speech Gen automatically switches back to the first built-in preset voice.

Clone a voice

Voice cloning creates a reusable custom voice from an audio sample. Use only voices you have the right to use, and choose a clean sample with clear speech.

Voice sample requirements

Voice samples can be uploaded as MP3, M4A, or WAV files.

The clone panel also displays these requirements and behaviors:

Recommended sample length: at least 30 seconds for best results
Accepted sample range shown in the tool: 10 seconds to 5 minutes
Recordings or uploads longer than the limit are automatically trimmed to 4:59
Files are automatically converted to MP3 when needed
Maximum processed file size: 20 MB
Clear audio with minimal background noise is best
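If you prepare samples in bulk, you may want to pre-check them against these limits before uploading. The sketch below is a hypothetical helper, not Speech Gen's own validation. It assumes you already know the sample's duration and its processed size, and it reads "20 MB" as 20 × 1024 × 1024 bytes, which the page does not specify.

```python
from __future__ import annotations

from dataclasses import dataclass

MIN_SECONDS = 10               # lower end of the accepted range
RECOMMENDED_SECONDS = 30       # recommended minimum for best results
TRIM_SECONDS = 4 * 60 + 59     # longer samples are trimmed to 4:59
MAX_BYTES = 20 * 1024 * 1024   # assumption: "20 MB" interpreted as 20 MiB


@dataclass
class SampleCheck:
    ok: bool
    notes: list[str]


def check_sample(duration_seconds: float, processed_bytes: int) -> SampleCheck:
    """Compare a sample against the limits listed on this page."""
    ok = True
    notes: list[str] = []
    if duration_seconds < MIN_SECONDS:
        ok = False
        notes.append("shorter than the 10-second minimum")
    elif duration_seconds < RECOMMENDED_SECONDS:
        notes.append("accepted, but at least 30 seconds is recommended")
    if duration_seconds > TRIM_SECONDS:
        notes.append("will be trimmed to 4:59")
    if processed_bytes > MAX_BYTES:
        ok = False
        notes.append("processed file is larger than 20 MB")
    return SampleCheck(ok, notes)


if __name__ == "__main__":
    print(check_sample(duration_seconds=400, processed_bytes=6_000_000))
```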

Clone from an uploaded file

In the left sidebar, expand Clone a Voice.
Enter a name in Name this voice.... If you focus the field while it is empty, Speech Gen suggests a simple name such as “My Voice 1.”
Click Upload voice sample.
Choose an MP3, M4A, or WAV file.
Wait while the tool analyzes the audio. If needed, it may convert the file to MP3 or trim it to the maximum duration.
Optional: enable Noise reduction.
Optional: enable Volume normalization.
Click Clone Voice.
The new voice appears in My Cloned Voices while it is being created.
When complete, it becomes selectable for speech generation.

If the uploaded file name is available and you have not already typed a voice name, Speech Gen uses the file name as a starting voice name, with underscores and hyphens replaced by spaces.
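The renaming rule described above can be approximated in a few lines. This is a sketch under stated assumptions, not the tool's actual code. The page only says that underscores and hyphens become spaces. Dropping the file extension and collapsing repeated spaces are assumptions added here, and `voice_name_from_filename` is a made-up name.

```python
from pathlib import PurePath


def voice_name_from_filename(filename: str) -> str:
    """Derive a starting voice name from an uploaded file name."""
    stem = PurePath(filename).stem                    # assumption: extension is dropped
    name = stem.replace("_", " ").replace("-", " ")   # rule stated on this page
    return " ".join(name.split())                     # assumption: collapse extra spaces


if __name__ == "__main__":
    print(voice_name_from_filename("old_wizard-take2.mp3"))
```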

Record a voice sample

Speech Gen includes a full-screen voice recording studio with a teleprompter script.

In Clone a Voice, enter a voice name or leave the name empty and let Speech Gen suggest one.
Click Record your voice.
The Voice Recording Studio opens.
Read the on-screen script aloud at a natural pace.
Click Start Recording.
Speak clearly and consistently.
Click Stop Recording when finished.
- Recording automatically stops at 4:59.
Wait while the recording is converted.
Optional: enable Noise reduction and/or Volume normalization.
Click Clone Voice.

The teleprompter script is intentionally long and varied. You do not have to read the entire script, but longer clean samples generally produce better cloned voices than very short clips.

Recording controls

In the recording studio:

The top bar shows the recording timer while recording.
The timer displays progress up to 4:59.
Start Recording begins microphone capture.
Stop Recording ends the recording.
The close button exits the recording studio. If you close while recording, the recording is stopped.
A note at the bottom recommends aiming for at least 30 seconds and says longer is better up to the tool limit.

If your browser blocks microphone access, allow microphone permission for the site and try again.

Generation settings

Model

Speech Gen offers two generation modes:

| Model | Best for | In-tool description |
|---|---|---|
| HD | Final narration, important dialogue, trailers, polished voiceover | Higher-quality / high-definition audio quality |
| Turbo | Drafts, rapid iteration, quick variations | Faster generation |

Speed

Controls how quickly the line is delivered.

Range: 0.5x to 2.0x
Default: 1.0x
Lower values produce slower speech
Higher values produce faster speech

Pitch

Controls the voice pitch.

Range: -12 to +12
Default: 0
Negative values lower the pitch
Positive values raise the pitch

Emotion

Controls the intended emotional delivery.

Options:

Neutral
Happy
Calm
Sad
Angry
Fearful
Disgusted
Surprised

Emotion can help shape a performance, but results vary depending on the selected voice and the text. For best results, use punctuation and wording that support the emotion you choose.
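If you keep a script sheet that records the settings for each line, a small validation step can catch values outside the ranges listed above before you enter them in the tool. The following is an illustrative sketch only. `SpeechSettings` is a hypothetical structure, not a Speech Gen API, and the defaults for model and emotion are assumptions because the page states defaults only for Speed and Pitch.

```python
from dataclasses import dataclass

MODELS = ("HD", "Turbo")
EMOTIONS = ("Neutral", "Happy", "Calm", "Sad",
            "Angry", "Fearful", "Disgusted", "Surprised")


@dataclass(frozen=True)
class SpeechSettings:
    model: str = "HD"          # assumption: the page does not state a default model
    speed: float = 1.0         # documented default, range 0.5x to 2.0x
    pitch: float = 0           # documented default, range -12 to +12
    emotion: str = "Neutral"   # assumption: the page does not state a default emotion

    def __post_init__(self) -> None:
        if self.model not in MODELS:
            raise ValueError(f"model must be one of {MODELS}")
        if not 0.5 <= self.speed <= 2.0:
            raise ValueError("speed must be between 0.5 and 2.0")
        if not -12 <= self.pitch <= 12:
            raise ValueError("pitch must be between -12 and +12")
        if self.emotion not in EMOTIONS:
            raise ValueError(f"emotion must be one of {EMOTIONS}")


if __name__ == "__main__":
    print(SpeechSettings(model="Turbo", speed=1.2, pitch=-3, emotion="Calm"))
```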

Interjections

The text box notes support for parenthetical interjections, including:

(laughs)
(sighs)
(coughs)
(gasps)

Use interjections sparingly where a performance cue is needed. Too many cues can make a line sound unnatural.
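To review how many cues a script contains before generating, you can scan it for parenthetical cues. This is a rough sketch that assumes a cue is a single lowercase word in parentheses, as in the examples above. The page does not define the full cue syntax, so treat the pattern as a guess.

```python
import re

# Assumption: a cue is one lowercase word in parentheses, e.g. "(laughs)".
CUE_PATTERN = re.compile(r"\(([a-z]+)\)")


def find_cues(text: str) -> list:
    """Return the cue words found in the text, in order of appearance."""
    return CUE_PATTERN.findall(text)


if __name__ == "__main__":
    script = "(sighs) Fine. We leave at dawn (or earlier). (laughs)"
    cues = find_cues(script)
    print(len(cues), "cues:", cues)
```

Ordinary parenthetical asides that contain spaces, such as "(or earlier)", do not match the pattern.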

Work with generated narrations

Completed narrations appear as cards in the center gallery. Each card can show:

Narration name
Voice name
Model used
Text preview
Waveform display
Duration
Playback controls
Download control
Delete control

Narration cards are automatically named from the beginning of the generated text. You can rename them after generation.

Play audio

Click Play on a completed narration card to listen. While it is playing, the button changes to Stop.

You can also click a point on the waveform to start playback from that position. The waveform highlights playback progress and shows a duration below it once the waveform has loaded.

Download audio

Click the download button on a narration card to save the speech clip as an MP3 file.

In selection mode, you can select multiple completed narrations and click Download to download the selected clips.

Rename a narration

Click the narration title on its card.
Type the new name.
Press Enter or click away to save.
Press Escape to cancel while editing.

Delete a narration

Click the trash button on a narration card to remove it. Failed generations include a Remove button on the failed-state card.

To delete multiple completed narrations:

Click Select above the gallery.
Click the narrations you want to delete.
Click Delete.
Confirm the deletion.

Deletion cannot be undone.

Organize with projects

Speech Gen includes project tabs for organizing narration clips by game, character, scene, chapter, or workflow stage.

Project tabs

All Narrations shows every narration.
Custom project tabs show narrations assigned to that project.

Selecting a project also clears the current search query, making it easier to focus on that project’s contents.

Create a project

Click the folder-plus button beside the project tabs. A new project is created with a default numbered name and selected automatically.

Rename a project

Select the project tab.
Click the pencil icon on the active tab.
Type the new name.
Press Enter or click away to save.
Press Escape to cancel while editing.

Move narrations between projects

Completed narration cards can be dragged onto project tabs.

Drag a completed narration card onto a project tab to assign it to that project.
Drag it onto All Narrations to remove it from a project.

Only completed narration cards are draggable.

Delete a project

Select the project tab.
Click the trash icon on the active tab.
Choose one of the options:
- Keep narrations (move to All): deletes the project but keeps its narrations.
- Delete narrations too: deletes both the project and the narrations inside it.
Confirm or cancel.

Search and selection

When you have narrations in the gallery, Speech Gen shows a search bar above the cards.

Search filters narrations by:

Narration name
Text content
Voice name

Click the X in the search field to clear the search.
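The three searched fields can be pictured as a simple filter over narration records. The sketch below is only a model of that behavior. The `Narration` type is hypothetical, and case-insensitive substring matching is an assumption, since the page does not say how matching works.

```python
from dataclasses import dataclass


@dataclass
class Narration:
    name: str
    text: str
    voice: str


def search(narrations, query: str):
    """Return narrations whose name, text, or voice contains the query."""
    q = query.strip().lower()   # assumption: matching ignores case
    if not q:
        return list(narrations)
    return [
        n for n in narrations
        if q in n.name.lower() or q in n.text.lower() or q in n.voice.lower()
    ]


if __name__ == "__main__":
    clips = [
        Narration("Intro", "Welcome to the valley.", "Wise Woman"),
        Narration("Guard bark", "Halt! Who goes there?", "Young Knight"),
    ]
    print([n.name for n in search(clips, "knight")])
```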

Click Select to enter selection mode. Selection mode works on completed narrations and lets you:

See how many narrations are selected
Download selected narrations
Delete selected narrations

Click Select again to exit selection mode and clear the current selection.

Tips for better speech

Write punctuation intentionally. Commas, periods, line breaks, and sentence length can affect pacing.
Generate long scripts in sections if you want easier review, replacement, and organization.
Use HD for final narration, trailers, important dialogue, or voiceover.
Use Turbo when you need quick drafts or variations.
Try neutral emotion first, then regenerate with a specific emotion if the line needs stronger delivery.
Use interjections only where they add value.
For character dialogue, generate several short takes rather than one very long passage.
For cloned voices, record in a quiet room with minimal echo.
Keep a consistent distance from the microphone when recording a clone sample.
Avoid background music, overlapping speakers, heavy reverb, wind, or room noise in clone samples.
A clean one-to-five-minute sample usually works better than a noisy or inconsistent sample.
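For the tip about generating long scripts in sections, a small helper can split a script at sentence boundaries so each section stays under the 10,000-character limit. This is a minimal sketch, not part of Speech Gen. `split_script` is a made-up name, and sentence detection here is a simple punctuation-based approximation.

```python
import re

MAX_CHARS = 10_000  # per-generation text limit quoted on this page


def split_script(script: str, limit: int = MAX_CHARS) -> list:
    """Split a script into sections of at most `limit` characters.

    Breaks fall after ., !, or ? where possible. Whitespace between
    sentences (including line breaks) is replaced by a single space.
    """
    sentences = re.split(r"(?<=[.!?])\s+", script.strip())
    sections = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than the limit is cut at the limit.
        while len(sentence) > limit:
            if current:
                sections.append(current)
                current = ""
            sections.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate
        else:
            sections.append(current)
            current = sentence
    if current:
        sections.append(current)
    return sections


if __name__ == "__main__":
    for part in split_script("One. Two. Three.", limit=10):
        print(repr(part))
```

A smaller `limit` than the maximum gives shorter sections, which are easier to review and replace individually.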

Troubleshooting

I cannot generate speech or clone a voice

You must be signed in to generate speech, clone voices, and save your own results. Sign in, then return to Speech Gen.

My generation is still processing

Speech generation can take a little time. The narration card remains in the gallery while it is being created. You can continue submitting other lines while waiting.

A generation failed

The failed card shows an error message when available. Remove the failed card and try again. If your script is very long, try splitting it into smaller sections.

My uploaded voice file is rejected

Use MP3, M4A, or WAV. If the processed file is too large, shorten the sample or export it at a lower bitrate and try again.

The tool says the file is too large after processing

The processed voice sample must be no larger than 20 MB. Use a shorter sample, remove silence, or export at a lower quality setting before uploading again.

Audio processing failed

Try a different source file or manually convert the audio to MP3 before uploading. If the original file is small enough, the tool may still accept it even if automatic processing fails.

Microphone recording does not start

Allow microphone permission in your browser, then try again. If permission was previously blocked, update the site permission in your browser settings and reload Speech Gen.

My recording did not convert

Try recording again, or use the upload option instead. If the issue continues, record with another app, export as MP3, M4A, or WAV, and upload the file.

My cloned voice sounds poor

Clone again with a cleaner sample. Use a quiet room, speak naturally, keep the microphone distance steady, avoid background noise, and aim for at least 30 seconds of clear speech. Longer samples up to the tool limit often produce better results.

Voice preview is unavailable

A cloned voice preview may become unavailable. If the voice is still listed as completed, it may still be usable for speech generation even when preview playback fails.

I cannot drag a narration into a project

Only completed narrations are draggable. Wait for the generation to complete, then drag the card onto the desired project tab.

FAQ

Do I need to sign in?

Yes. Signing in is required to generate speech, clone voices, and save your results. Signed-out visitors may see demo content.

What format are downloads?

Generated speech downloads as MP3.

Can I reuse a cloned voice?

Yes. Once a voice clone succeeds, it remains available under My Cloned Voices and in the voice selector until you delete it.

Can I generate multiple clips at once?

You can submit another generation while a previous one is still being created. Each generation appears as its own card in the gallery.

Can I organize voice lines by game or character?

Yes. Create project tabs, rename them, and drag completed narrations into the appropriate project.

Can I record directly in the browser?

Yes. Use Record your voice in the Clone a Voice panel to open the recording studio and read from the teleprompter.

Can I use interjections in the script?

Yes. The text box supports parenthetical cues such as (laughs), (sighs), (coughs), and (gasps). Results depend on the selected voice and phrasing.
