
Sorceress Tool API

Updated July 4, 2026

The Sorceress Tool API lets external apps, automation, and agent systems use selected Sorceress creation tools with your Sorceress account. It is designed for headless workflows: your app can discover available tools, start a tool run, and, for longer generations, check back until the finished asset is ready.

Open the live Tool API guide here: https://sorceress.games/tools-guide

What it does

The Tool API exposes a catalog of Sorceress tools that authenticated callers can discover and use. It currently supports:

  • Connectivity testing with Ping
  • Text-to-image generation across Sorceress image models
  • Sound effect generation from text prompts
  • Text-to-speech generation with preset and cloned voices
  • Voice listing for speech generation
  • Music generation with simple or advanced song controls

Most creative tools run asynchronously. In practice, that means your integration starts a generation, receives a job ID, and checks that job until the finished image or audio is available. Finished assets are returned as stable URLs and are saved into your Sorceress generation library when possible.

Who it is for

Use the Tool API when you want Sorceress generation features outside the normal website UI, such as:

  • A custom game-building pipeline that creates assets automatically.
  • A private tool that generates images, voices, music, or sound effects for your team.
  • An AI agent that needs access to Sorceress creation tools.
  • A backend service that prepares assets for games, prototypes, or content libraries.

Do not put Tool API keys in public browser code, downloadable game clients, or any place where players or visitors can inspect them. The API is meant to be called from trusted software you control.

Requirements

To use the Tool API, you need:

  1. A Sorceress account.
  2. A Sorceress API key created from the website.
  3. A client or integration capable of sending authenticated JSON requests.
  4. Access to the tool IDs and input parameters from the live Tool API guide or discovery response.

API keys are account-bound. A key can only access tools and jobs for the Sorceress account that created it. Some keys may also be scoped to specific tools; scoped keys can only use the tools included in their scope.

Authentication and API key safety

Tool discovery, tool invocation, and job checking require a Sorceress API key. Your integration sends the key as an authorization bearer token.

If the key is missing, mistyped, revoked, or used for the wrong account, the API returns an authentication or authorization error.

Treat API keys like passwords:

  • Store them securely in server-side environment variables or a secret store.
  • Do not commit them to source control.
  • Do not embed them in public client-side code.
  • Revoke keys you no longer use.
  • Use separate keys for separate apps, environments, or agents when possible.
  • Prefer scoped keys when an integration only needs a small set of tools.
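The bearer-token scheme and the storage advice above can be sketched as follows. This is a minimal illustration, not the official client: the environment variable name SORCERESS_API_KEY and both helper names are assumptions made for the example.

```python
import os


def load_api_key(env_var: str = "SORCERESS_API_KEY") -> str:
    """Read the key from the server-side environment.

    The variable name is an assumption for this sketch; use whatever
    your secret storage provides.
    """
    key = os.environ.get(env_var, "")
    if not key:
        raise RuntimeError(f"Set {env_var} before starting the integration")
    return key


def build_headers(api_key: str) -> dict[str, str]:
    """Return JSON request headers carrying the key as a bearer token."""
    key = api_key.strip()
    if not key:
        raise ValueError("Sorceress API key is empty")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Keeping the key out of the source file means it never lands in version control, and a separate key per environment only requires a different environment value.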

Managing API keys

API keys are created from the Sorceress website while you are logged in. Key-management actions use your logged-in website session rather than an existing Tool API key, so you can bootstrap your first key from the site.

Available key-management actions are:

  • List keys — view your existing keys as display records. The full secret value is not shown after creation.
  • Create key — create a new key with an optional name. The raw secret is shown once.
  • Revoke key — delete one of your own keys so it can no longer be used.

Because raw keys are shown only once, copy the key immediately when you create it. If you lose it, revoke the old key and create a new one.

Discovering available tools

Use tool discovery to retrieve the current catalog before building a UI or agent tool list. Discovery returns both available tools and roadmap entries, so integrations can show what is live and what is planned.

Only tools that are both enabled and live can be invoked. A planned or disabled tool may appear in the catalog but will not run.

Each tool descriptor can include:

  • Tool ID
  • Label and description
  • Status: planned, building, or live
  • Whether the tool is enabled
  • Cost note or cost metadata, when relevant
  • Input parameters
  • Optional model details, supported aspect ratios, and option information

For image generation, discovery is especially important because available models, aspect ratios, reference-image support, and model-specific options can vary.
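As a rough sketch of the "live and enabled" rule, the function below filters a discovery catalog down to the tools that can actually be invoked. The field names id, status, and enabled are assumptions modeled on the descriptor list above, and the AutoSprite tool ID is hypothetical; check the real discovery response for the exact shape.

```python
from typing import Any


def invocable_tools(catalog: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Keep only descriptors that are both live and enabled."""
    return [
        tool
        for tool in catalog
        if tool.get("status") == "live" and tool.get("enabled") is True
    ]


# Hypothetical catalog; field names and the "autosprite" ID are assumptions.
sample_catalog = [
    {"id": "ping", "status": "live", "enabled": True},
    {"id": "image_generate", "status": "live", "enabled": True},
    {"id": "autosprite", "status": "planned", "enabled": False},
]

live_ids = [tool["id"] for tool in invocable_tools(sample_catalog)]
print(live_ids)  # ['ping', 'image_generate']
```

An agent tool list can be built from the filtered result, while the unfiltered catalog remains available for showing roadmap entries.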

Invoking tools

To run a tool, send JSON input for the selected tool. Inputs can be sent as top-level fields of the JSON body or wrapped inside an input object, whichever suits your integration style.

A successful synchronous tool returns its result immediately. A successful asynchronous generation tool returns a job ID and an initial processing state. Your app should store the job ID and check it until the final asset is ready.

Common response fields include:

  • ok — whether the request was accepted or completed successfully.
  • tool — the tool that handled the request.
  • data — result data, often including a job ID for asynchronous tools.
  • error or message — failure information when something goes wrong.

For asynchronous tools, the final job response can include image, audio, or music track URLs.
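One possible way to handle the request body and the response envelope is sketched below. The ok, tool, data, error, and message fields come from the list above; the "input" wrapper key name, the ToolError class, and both helper names are assumptions for illustration.

```python
from typing import Any


class ToolError(Exception):
    """Raised when a Tool API response reports a failure."""


def wrap_input(tool_input: dict[str, Any], wrapped: bool = False) -> dict[str, Any]:
    """Build the request body, optionally nested under an "input" key.

    The key name "input" is an assumption based on the text above.
    """
    return {"input": dict(tool_input)} if wrapped else dict(tool_input)


def unwrap_result(response: dict[str, Any]) -> dict[str, Any]:
    """Return the data object, or raise ToolError built from error/message."""
    if not response.get("ok"):
        detail = response.get("error") or response.get("message") or "unknown error"
        raise ToolError(f"{response.get('tool', 'tool')}: {detail}")
    return response.get("data") or {}
```

Raising on a falsy ok keeps failure handling in one place, so callers only ever see result data or an exception.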

Checking asynchronous jobs

Image, sound effect, speech, and music generation usually return a job ID rather than the final asset immediately. Check the job periodically until it reaches a final result.

Typical states are:

  • processing — the generation is still running; check again later.
  • succeeded — the asset is ready and result URLs are included.
  • failed — the generation could not be completed; an error message is included.

Successful image jobs return image assets and a convenience URL for the first image. Successful sound effect and speech jobs return a single audio asset. Successful music jobs return multiple tracks when variations are available, plus a convenience URL for the first track.

Do not poll the same job in a tight loop. Waiting a short interval between checks reduces load on both your integration and Sorceress.
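A polling loop for the three states above might look like the following sketch. The check_job callable stands in for whatever function your integration uses to fetch a job; the status field name, the 3-second delay, and the 300-second timeout are assumptions, not documented values.

```python
import time
from typing import Any, Callable


def wait_for_job(
    check_job: Callable[[str], dict[str, Any]],
    job_id: str,
    delay_seconds: float = 3.0,
    timeout_seconds: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, Any]:
    """Poll check_job(job_id) until the job succeeds, fails, or times out."""
    waited = 0.0
    while True:
        job = check_job(job_id)
        state = job.get("status")  # assumed field name for the job state
        if state == "succeeded":
            return job
        if state == "failed":
            raise RuntimeError(job.get("error") or job.get("message") or "job failed")
        if waited >= timeout_seconds:
            raise TimeoutError(f"job {job_id} still processing after {waited:.0f}s")
        sleep(delay_seconds)  # pause between checks instead of polling in a tight loop
        waited += delay_seconds
```

Passing sleep as a parameter makes the loop testable without real delays, and the timeout gives your app a point at which to tell the user the job is still unresolved.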

Live tools

Ping

Tool ID: ping
Mode: Synchronous

Ping verifies that authentication, discovery, and invocation are working. It can echo an optional message and returns basic caller information plus the server time.

Input:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| message | string | No | Optional text to echo back. |

Use Ping first when setting up a new integration. If Ping works, your key and basic request format are correct.

Image Generation

Tool ID: image_generate
Mode: Asynchronous

Image Generation creates images from a text prompt using Sorceress image models. The current model catalog, supported aspect ratios, reference-image limits, and model options are returned by tool discovery.

Input:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | Model key. Must be one of the discovered image model IDs. |
| prompt | string | Yes | Text description of the image to generate. |
| aspectRatio | string | No | Desired aspect ratio, such as 16:9, 1:1, or 9:16. Must be supported by the chosen model. |
| params | object | No | Model-specific options such as quality, resolution, or size. Options vary by model. |
| refImages | array | No | Reference image URLs for models that support image input. Limits vary by model. |

Notes:

  • The available Sorceress image model list is discoverable and may change over time.
  • Some models support reference images; others do not.
  • Some model options affect output style, size, or quality.
  • Some models may return more than one image from a single generation.
  • If you request an unsupported aspect ratio, unsupported reference image input, or too many reference images, the call will return an input error.

Recommended image workflow:

  1. Use discovery to list image models and options.
  2. Let the user or agent choose a model, aspect ratio, prompt, and optional model settings.
  3. Start the image generation.
  4. Store the returned job ID.
  5. Check the job until it succeeds or fails.
  6. Use the returned image URL or open the saved result in the Sorceress generation library.
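Steps 1 to 3 of the workflow above could be backed by a small client-side check before the request is sent. In this sketch the input keys (model, prompt, aspectRatio, params, refImages) follow the parameter table, while the descriptor fields id, aspectRatios, and maxRefImages are assumed names; read the real ones from your discovery response.

```python
from typing import Any, Optional


def build_image_input(
    model: dict[str, Any],
    prompt: str,
    aspect_ratio: Optional[str] = None,
    params: Optional[dict[str, Any]] = None,
    ref_images: Optional[list[str]] = None,
) -> dict[str, Any]:
    """Validate choices against a discovered model descriptor, then build the input."""
    if not prompt.strip():
        raise ValueError("prompt is required")
    body: dict[str, Any] = {"model": model["id"], "prompt": prompt}
    if aspect_ratio is not None:
        # "aspectRatios" is an assumed descriptor field name.
        if aspect_ratio not in model.get("aspectRatios", []):
            raise ValueError(f"{model['id']} does not support aspect ratio {aspect_ratio}")
        body["aspectRatio"] = aspect_ratio
    if ref_images:
        # "maxRefImages" is an assumed descriptor field; 0 means no image input.
        limit = model.get("maxRefImages", 0)
        if len(ref_images) > limit:
            raise ValueError(f"{model['id']} accepts at most {limit} reference images")
        body["refImages"] = list(ref_images)
    if params:
        body["params"] = dict(params)
    return body
```

Catching an unsupported aspect ratio or too many reference images locally gives the user feedback before a request is made, although the API remains the final authority on what is valid.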

Sound Effects

Tool ID: sfx_generate
Mode: Asynchronous

Sound Effects generates a short MP3 sound effect from a text description.

Input:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| prompt | string | Yes | Description of the sound effect, such as “heavy wooden door creaking open”. Maximum 500 characters. |
| loop | boolean | No | Set to true to request a seamlessly loopable sound, useful for ambience. Default is false. |

Notes:

  • Prompts should be concise and specific.
  • Duration is controlled by the generation system and is intended for short effects.
  • If the prompt is longer than the supported limit, it is shortened before generation.
  • Use loop for ambience beds, engine hums, wind, rain, magical auras, or similar repeating sounds.

Good SFX prompts usually include the object, action, material, and style, for example:

  • “8-bit coin pickup, bright arcade chime”
  • “heavy wooden door creaking open in a stone dungeon”
  • “short magical shield impact, shimmering glass and low bass thump”
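A small helper can enforce the 500-character limit on the client, so an overlong prompt is rejected locally instead of being shortened during generation. This is only a sketch; the helper name and the decision to reject rather than trim are choices made for the example.

```python
from typing import Any

SFX_PROMPT_LIMIT = 500  # maximum prompt length from the parameter table


def build_sfx_input(prompt: str, loop: bool = False) -> dict[str, Any]:
    """Build sfx_generate input, rejecting prompts the service would shorten."""
    cleaned = " ".join(prompt.split())  # collapse stray whitespace and newlines
    if not cleaned:
        raise ValueError("prompt is required")
    if len(cleaned) > SFX_PROMPT_LIMIT:
        raise ValueError(
            f"prompt is {len(cleaned)} characters; the limit is {SFX_PROMPT_LIMIT}"
        )
    return {"prompt": cleaned, "loop": loop}
```

For example, build_sfx_input("wind through pine trees, soft and steady", loop=True) yields an input suited to an ambience bed.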

Speech / Text-to-Speech

Tool ID: speech_generate
Mode: Asynchronous

Speech converts text into spoken MP3 audio using either a preset voice or one of your cloned voices. Use the voice-listing tool first to get valid voice IDs.

Input:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| text | string | Yes | Text to speak. Maximum 10,000 characters. |
| voice_id | string | Yes | A preset voice ID or one of your cloned voice IDs. Use the voice-listing tool first. |
| speed | number | No | Speech rate from 0.5 to 2.0. Default is 1. |
| pitch | number | No | Pitch from -12 to 12. Default is 0. |
| emotion | string | No | One of none, happy, calm, sad, angry, fearful, disgusted, or surprised. Default is none. |
| volume | number | No | Volume multiplier. Default is 1. |
| language_boost | string | No | Optional language hint to improve pronunciation, such as “English” or “Spanish”. |

Speech tips:

  • Keep lines natural and readable. TTS performs better with punctuation.
  • Use shorter text blocks when you need fine control over pacing or retakes.
  • Choose emotion deliberately; extreme emotion settings can change delivery noticeably.
  • Use language_boost when pronunciation matters, especially for non-English or mixed-language lines.
  • If you change speed, test in-game so dialogue still fits your timing.
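The ranges in the parameter table lend themselves to a local check before calling speech_generate. The sketch below uses the documented parameter names and limits; the helper name is hypothetical, and volume is left out because the table gives no range for it.

```python
from typing import Any, Optional

EMOTIONS = {"none", "happy", "calm", "sad", "angry", "fearful", "disgusted", "surprised"}
MAX_TEXT_CHARS = 10_000


def build_speech_input(
    text: str,
    voice_id: str,
    speed: float = 1.0,
    pitch: float = 0,
    emotion: str = "none",
    language_boost: Optional[str] = None,
) -> dict[str, Any]:
    """Validate speech parameters against the documented ranges."""
    if not text.strip():
        raise ValueError("text is required")
    if len(text) > MAX_TEXT_CHARS:
        raise ValueError(f"text exceeds {MAX_TEXT_CHARS} characters")
    if not voice_id:
        raise ValueError("voice_id is required")
    if not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be between 0.5 and 2.0")
    if not -12 <= pitch <= 12:
        raise ValueError("pitch must be between -12 and 12")
    if emotion not in EMOTIONS:
        raise ValueError(f"emotion must be one of {sorted(EMOTIONS)}")
    body: dict[str, Any] = {
        "text": text,
        "voice_id": voice_id,
        "speed": speed,
        "pitch": pitch,
        "emotion": emotion,
    }
    if language_boost:
        body["language_boost"] = language_boost
    return body
```

Validating locally turns a round trip that would end in an invalid-input error into an immediate, specific message for the person writing the dialogue.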

List Speech Voices

Tool ID: speech_list_voices
Mode: Synchronous

List Speech Voices returns the voices available to your account for text-to-speech generation. It includes preset voices and your own successful cloned voices.

Returned voice information includes:

  • Voice ID
  • Display name
  • Voice type: preset or clone
  • Gender for preset voices when available
  • Creation time for cloned voices when available
  • Supported emotion values

Preset voices currently include 17 options:

  • Deep Voice Man
  • Casual Guy
  • Patient Man
  • Young Knight
  • Determined Man
  • Decent Boy
  • Imposing Manner
  • Elegant Man
  • Friendly Person
  • Wise Woman
  • Calm Woman
  • Inspirational Girl
  • Lively Girl
  • Lovely Girl
  • Abbess
  • Sweet Girl
  • Exuberant Girl

Use the returned voice ID exactly when calling Speech. Cloned voices are account-specific, so one account’s cloned voice IDs are not available to another account.

Music Generation

Tool ID: music_generate
Mode: Asynchronous

Music Generation creates a song from a description or, in advanced mode, from custom lyrics plus style controls. A successful generation typically returns two MP3 variations when available.

Input:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| prompt | string | No | In simple mode, a song description. In advanced mode, the lyrics. Leave empty for auto-lyrics or instrumental when style is provided. |
| customMode | boolean | No | False for simple mode; true for advanced mode. Default is false. |
| instrumental | boolean | No | True for no vocals. Default is false. |
| model | string | No | Model version. Options: V5_5, V5, V4_5PLUS, V4_5, V4. Default is V5_5. |
| style | string | No | Advanced mode only. Musical style or genre tags, such as “epic orchestral, cinematic”. |
| title | string | No | Advanced mode only. Song title, maximum 80 characters. |
| negativeTags | string | No | Advanced mode only. Styles to avoid. |
| vocalGender | string | No | Advanced mode only. Use m or f to bias the vocal. |
| weirdnessConstraint | number | No | Advanced mode only. 0–1 creativity/weirdness control. |
| styleWeight | number | No | Advanced mode only. 0–1 strength of style following. |

Generation requires at least a prompt, or in advanced mode a style.

Simple mode examples:

  • “loopable fantasy tavern folk song with lute, hand drum, and warm crowd energy”
  • “dark cyberpunk combat music, aggressive synth bass, 140 BPM, no vocals”
  • “peaceful farming village theme, acoustic guitar, soft flute, cozy and nostalgic”

Advanced mode is useful when you want to provide lyrics, a title, and tighter style tags. Use instrumental: true for vocal-free tracks.
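The simple and advanced modes can be captured in one input builder, sketched below under a few assumptions: the helper name is hypothetical, only a subset of the advanced fields is covered, and advanced-only fields are sent only when customMode is true, which is a design choice of this example rather than documented API behavior.

```python
from typing import Any, Optional

MUSIC_MODELS = ("V5_5", "V5", "V4_5PLUS", "V4_5", "V4")


def build_music_input(
    prompt: str = "",
    custom_mode: bool = False,
    instrumental: bool = False,
    model: str = "V5_5",
    style: Optional[str] = None,
    title: Optional[str] = None,
    vocal_gender: Optional[str] = None,
) -> dict[str, Any]:
    """Build music_generate input for simple or advanced mode."""
    if model not in MUSIC_MODELS:
        raise ValueError(f"model must be one of {MUSIC_MODELS}")
    has_prompt = bool(prompt.strip())
    has_style = custom_mode and bool(style and style.strip())
    if not (has_prompt or has_style):
        raise ValueError("provide a prompt, or a style in advanced mode")
    body: dict[str, Any] = {
        "customMode": custom_mode,
        "instrumental": instrumental,
        "model": model,
    }
    if has_prompt:
        body["prompt"] = prompt
    if custom_mode:  # advanced-only fields are attached only in advanced mode
        if has_style:
            body["style"] = style
        if title:
            if len(title) > 80:
                raise ValueError("title must be 80 characters or fewer")
            body["title"] = title
        if vocal_gender is not None:
            if vocal_gender not in ("m", "f"):
                raise ValueError("vocal_gender must be 'm' or 'f'")
            body["vocalGender"] = vocal_gender
    return body
```

A background cue could then be requested with build_music_input(custom_mode=True, style="epic orchestral, cinematic", instrumental=True), which relies on the style alone and leaves the prompt empty.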

Planned tools

Discovery may show planned tools that are not currently invocable. AutoSprite currently appears as a planned, disabled Tool API entry. It is listed so integrations can display roadmap information, but attempts to run disabled or non-live tools will return an availability error.

Common workflows

Set up a new integration

  1. Log in to Sorceress.
  2. Create an API key and copy the raw key immediately.
  3. Store the key securely in your integration.
  4. Run Ping to verify authentication.
  5. Use discovery to fetch the current tool catalog.
  6. Enable only the tools your integration needs.
  7. For generation tools, implement job checking and final asset handling.

Generate an image for a game asset pipeline

  1. Discover image models and choose one that supports the desired aspect ratio and reference-image behavior.
  2. Send a prompt, model ID, aspect ratio, and any model-specific options.
  3. Store the returned job ID.
  4. Check the job until it succeeds.
  5. Save the returned image URL in your asset database, content pipeline, or game editor.
  6. If the result is not usable, adjust the prompt or settings and generate again.

Generate dialogue audio

  1. Call List Speech Voices.
  2. Choose a preset or cloned voice ID.
  3. Split long scripts into manageable lines or scenes.
  4. Generate speech for each line.
  5. Check each job until the MP3 is ready.
  6. Review pacing, pronunciation, and emotion in context.
  7. Regenerate lines that need timing or delivery changes.
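Step 3 of the workflow above, splitting a long script, can be approximated with a sentence-based splitter. This is a simple sketch: the 400-character default is an arbitrary choice for easy retakes (well below the 10,000-character limit), and the punctuation-based sentence split will not handle every abbreviation or language.

```python
import re


def split_script(script: str, max_chars: int = 400) -> list[str]:
    """Group whole sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept whole, not cut mid-sentence.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", script.strip()) if s]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)  # close the chunk before it overflows
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


lines = split_script("Halt! Who goes there? State your name.", max_chars=22)
print(lines)  # ['Halt! Who goes there?', 'State your name.']
```

Each chunk then becomes one speech job, so a line with the wrong delivery can be regenerated without redoing the whole scene.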

Generate a music cue

  1. Decide whether you want simple description mode or advanced custom mode.
  2. For background music, consider instrumental: true.
  3. Provide style tags that match the game scene.
  4. Start generation and check the job.
  5. Review the returned variations and choose the best fit.
  6. Save the selected track URL or download it for editing.

Evaluating results

For generated assets, evaluate the result in the context where it will be used:

  • Images: Check composition, aspect ratio, style consistency, and whether important details are visible at game resolution.
  • Sound effects: Test volume, attack, tail length, and whether the sound reads clearly during gameplay.
  • Speech: Listen for pronunciation, emotion, pacing, and line timing.
  • Music: Test the cue under gameplay or scene audio, not just in isolation.

If results miss the target, revise the prompt with more concrete constraints rather than only adding adjectives. Mention genre, mood, materials, camera/composition, instruments, or gameplay use case as appropriate.

Common errors and how to fix them

Unauthorized

The request is missing a valid Sorceress API key, the key was typed incorrectly, or it has been revoked.

Fix: create or copy a valid API key and send it as a bearer token from trusted server-side code.

Forbidden

The requested job belongs to a different account than the key, or the key is scoped and its scope does not include the requested tool.

Fix: use the correct account key, or create a key with access to the needed tool.

Unknown tool or job

The tool ID or job ID does not exist.

Fix: use discovery for tool IDs and store job IDs returned by generation calls.

Tool not available

The tool exists but is not enabled or is not live.

Fix: only invoke tools marked live and enabled in discovery.

Subscription required

Your key authenticated successfully, but your account's current membership or access status does not include the requested tool.

Fix: log in to the correct account and check your Sorceress membership or access status.

Payment required or insufficient balance

The requested generation could not be started or completed because the account did not have enough available generation balance.

Fix: use a lower-cost option where available, or update your account balance from the Sorceress website.

Invalid input

A required parameter is missing, an option is outside its allowed range, an unsupported aspect ratio was selected, an invalid emotion was provided, text exceeded the supported length, or a reference-image limit was exceeded.

Fix: check the tool’s parameter list and model details from discovery.

Job stays processing

Some generations take longer than others. Temporary provider or network delays can also make a job remain in progress for a while.

Fix: continue checking after a short delay. Avoid extremely aggressive polling. If a job remains unresolved for an unusually long time, try a new generation or contact support with the job ID.

Generated asset is not what I expected

The generation completed, but the result does not match the intended use.

Fix: make the prompt more specific. For images, include subject, style, composition, and aspect-ratio-aware framing. For audio, include source, material, mood, and gameplay purpose. For speech, adjust punctuation, speed, emotion, or voice.

Best practices

  • Start with Ping to verify your integration.
  • Use discovery at startup so your app reflects the current tools, models, options, and statuses.
  • Store job IDs until completion.
  • Treat job checking as repeatable and safe; your app may check the same job more than once.
  • Copy the raw API key immediately when it is created; it cannot be retrieved later.
  • Use separate keys for development, staging, and production.
  • Use scoped keys for agents that only need specific tools.
  • Never expose Sorceress API keys in public browser code or downloadable game clients.
  • Build retry and timeout behavior into your own integration so users get clear feedback.
  • Save returned asset URLs alongside your own project metadata so generated content remains easy to find.

FAQ

Are generated assets temporary?

No. Finished assets returned by successful jobs are copied into Sorceress-managed storage before being returned, so the result URLs are stable. They are also saved to your Sorceress generation library when possible.

Can one API key access another user’s jobs?

No. Jobs are account-bound. A key can only check jobs created by the same account.

Can I use cloned voices?

Yes. Use speech_list_voices to retrieve cloned voice IDs available to your account, then pass one of those IDs to speech_generate.

Why does discovery include tools that cannot be invoked?

Discovery includes planned and disabled tools so integrations can show roadmap information. Only live, enabled tools can be run.

Do all tools return immediately?

Ping and voice listing return immediately. Image, sound effect, speech, and music generation return a job ID first, then provide the final asset when the job succeeds.

Can I choose exact music duration or exact SFX length?

Not currently through the Tool API. Sound effect duration is generation-controlled, and music returns generated track variations based on your prompt and settings.

What should I do if I lose an API key?

Revoke the lost key and create a new one. For security reasons, the full raw key is only shown at creation time.