Voice Persona

Create a personalized Voice Persona through voice verification. This is a single-task, two-phase process: init (upload voice + get verification phrase) → complete (upload verification recording + create persona). The entire flow uses one taskId.

Workflow

 User's voice audio
      │
      ▼
 ① voicePersona/init
      │  Upload voice → Extract vocals → Return verification phrase
      │
      │  Returns: { taskId }
      │  Poll: GET /suno/v2/status?taskId=xxx
      │  Wait for status == "awaiting"
      │  data: { phrase_text, ... }
      │
      ▼
 User reads phrase_text aloud and records (within 30s timeout)
      │
      ▼
 ② voicePersona/complete (same taskId)
      │  Upload verification recording → Voice verification → Create Persona
      │
      │  Poll same taskId: GET /suno/v2/status?taskId=xxx
      │  Wait for status == "success"
      │  data: persona details
      ▼
    Done → Use persona in /generate

Task Status Flow

queued → running → awaiting → running → success
                      │          │
                      │          └── complete fails → failed
                      └── User timeout → failed (VP_USER_TIMEOUT)

After the task reaches awaiting status, you must call complete within 30 seconds (default). If the timeout is exceeded, the task will fail with VP_USER_TIMEOUT and you’ll need to restart from init.
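The status flow and the 30-second window can be tracked on the client side. The sketch below is illustrative only: `nextAction` and `msLeftToComplete` are hypothetical helper names, and it assumes the default 30s timeout has not been changed. Because the client only sees `awaiting` on its next poll, the real server-side window is likely a little shorter than the value computed here.

```javascript
// Default window from the text above; assumed not to be overridden.
const COMPLETE_TIMEOUT_MS = 30000;

// Map a task status to what the client should do next (hypothetical labels).
function nextAction(status) {
  switch (status) {
    case 'queued':
    case 'running':
      return 'poll';           // keep polling the same taskId
    case 'awaiting':
      return 'call-complete';  // the 30s window is now open
    case 'success':
      return 'done';
    case 'failed':
      return 'stop';           // terminal; a new init is needed to try again
    default:
      return 'unknown';
  }
}

// Milliseconds left before VP_USER_TIMEOUT, measured from the moment the
// client first observed "awaiting". Never returns a negative number.
function msLeftToComplete(awaitingSeenAtMs, nowMs = Date.now(), timeoutMs = COMPLETE_TIMEOUT_MS) {
  return Math.max(0, awaitingSeenAtMs + timeoutMs - nowMs);
}
```

A recording UI could use `msLeftToComplete` to show a countdown and refuse to submit once it reaches zero.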

Step 1: Init — Upload Voice & Get Verification Phrase

Upload the user’s voice audio. The system extracts vocals and returns a verification phrase that the user must read aloud.

This is an async task. Poll Get Task Status with the returned taskId. Wait for status to become awaiting (not success).

Request

POST /suno/v2/voicePersona/init

Field	Type	Required	Description
`voice_audio_url`	string (URL)	Yes	Publicly downloadable URL of the voice audio (WAV/MP3)
`language`	string	Yes	Verification phrase language: `zh` `en` `ja` `ko` `es` `fr` `de` `pt` `ru` `hi`
`vocal_start_s`	number	No	Vocal extraction start time (seconds), default: 0
`vocal_end_s`	number	No	Vocal extraction end time (seconds), default: auto-detected
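Validating the request before sending it avoids spending an init task on an obviously bad payload. The following is a minimal sketch, not part of the API: `buildInitBody` is a hypothetical helper, and the rule that `vocal_end_s` must be greater than `vocal_start_s` is an assumption the table does not state.

```javascript
// Language codes listed in the table above.
const SUPPORTED_LANGUAGES = ['zh', 'en', 'ja', 'ko', 'es', 'fr', 'de', 'pt', 'ru', 'hi'];

// Build the JSON body for POST /suno/v2/voicePersona/init.
function buildInitBody({ voiceAudioUrl, language, vocalStartS, vocalEndS }) {
  if (typeof voiceAudioUrl !== 'string' || !/^https?:\/\//.test(voiceAudioUrl)) {
    throw new Error('voice_audio_url must be a publicly downloadable http(s) URL');
  }
  if (!SUPPORTED_LANGUAGES.includes(language)) {
    throw new Error(`Unsupported language: ${language}`);
  }
  const body = { voice_audio_url: voiceAudioUrl, language };
  // Optional fields are omitted so the server applies its own defaults.
  if (vocalStartS !== undefined) body.vocal_start_s = vocalStartS;
  if (vocalEndS !== undefined) body.vocal_end_s = vocalEndS;
  // Assumption: an end time at or before the start time is never valid.
  if (vocalStartS !== undefined && vocalEndS !== undefined && vocalEndS <= vocalStartS) {
    throw new Error('vocal_end_s must be greater than vocal_start_s');
  }
  return body;
}
```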

Polling Result (status: awaiting)

When the task reaches awaiting status, data contains:

Field	Description
`phrase_text`	Verification phrase text — user must read this aloud and record
`phrase_id`	Verification phrase ID (internal)
`vox_audio_id`	Extracted vocal audio ID (internal)
`voice_recording_id`	Recording ID (internal)
`vocal_start_s`	Vocal start time (seconds)
`vocal_end_s`	Vocal end time (seconds)

Only phrase_text is needed by the user. All other fields are used internally by the system — you do not need to pass them to the complete step.
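As a small illustration of reading the polling result, the sketch below pulls `phrase_text` out of a status response. It assumes the response has the `{ status, data }` shape used in the Complete Example on this page; `readPhrase` is a hypothetical helper name.

```javascript
// Return the verification phrase once the task is awaiting, or null if the
// task has not reached that state yet.
function readPhrase(task) {
  if (task.status !== 'awaiting') return null;
  const phrase = task.data && task.data.phrase_text;
  if (typeof phrase !== 'string' || phrase.trim() === '') {
    throw new Error('Task is awaiting but data.phrase_text is missing');
  }
  // The other fields in data are internal and can be ignored by the client.
  return phrase;
}
```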

See Init API Reference →

Step 2: Complete — Upload Verification Recording & Create Persona

After the user reads phrase_text aloud and records it, upload the verification recording using the same taskId to complete voice verification and create the persona.

Complete uses the same taskId as init. After calling complete, continue polling that taskId until status becomes success.

Request

POST /suno/v2/voicePersona/complete

Field	Type	Required	Description
`taskId`	string (UUID)	Yes	The taskId from init (same task)
`verification_audio_url`	string (URL)	Yes	User’s verification recording URL (WAV/MP3)
`name`	string	Yes	Persona name
`description`	string	No	Persona description
`is_public`	boolean	No	Whether public (default: false)
`image_s3_id`	string	No	Cover image (base64), auto-generated if not provided

No intermediate data (vox_audio_id, phrase_id, etc.) is needed — the system reads them automatically from the init phase.
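The complete request can be assembled from the init taskId plus the persona details alone. This is a sketch under that reading of the table: `buildCompleteBody` is a hypothetical helper, and `image_s3_id` is deliberately left out so the cover image is auto-generated.

```javascript
// Build the JSON body for POST /suno/v2/voicePersona/complete.
function buildCompleteBody({ taskId, verificationAudioUrl, name, description, isPublic }) {
  const required = { taskId, verification_audio_url: verificationAudioUrl, name };
  for (const [field, value] of Object.entries(required)) {
    if (typeof value !== 'string' || value.trim() === '') {
      throw new Error(`${field} is required`);
    }
  }
  const body = { ...required };
  // Optional fields: omit them to fall back to the documented defaults.
  if (description !== undefined) body.description = description;
  if (isPublic !== undefined) body.is_public = Boolean(isPublic);
  return body;
}
```

Note that nothing from the init polling result (`phrase_id`, `vox_audio_id`, and so on) appears in the body.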

See Complete API Reference →

Complete Example

const API_BASE = 'https://api.mountsea.ai';
const headers = {
  'Content-Type': 'application/json',
  'Authorization': 'Bearer your-api-key'
};

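// Poll the task every 3s until it reaches targetStatus; throw if it fails first.
async function pollTask(taskId, targetStatus = 'success') {
  while (true) {
    const res = await fetch(`${API_BASE}/suno/v2/status?taskId=${encodeURIComponent(taskId)}`, { headers });
    if (!res.ok) throw new Error(`Status request failed: HTTP ${res.status}`);
    const task = await res.json();
    if (task.status === targetStatus) return task.data;
    // "success" is terminal: return instead of polling forever for another status
    if (task.status === 'success') return task.data;
    if (task.status === 'failed') throw new Error(task.failReason);
    await new Promise(r => setTimeout(r, 3000));
  }
}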

// Step 1: Init — upload voice and get verification phrase
const initRes = await fetch(`${API_BASE}/suno/v2/voicePersona/init`, {
  method: 'POST',
  headers,
  body: JSON.stringify({
    voice_audio_url: 'https://example.com/my-voice.wav',
    language: 'zh'
  })
});
const { taskId } = await initRes.json();

// Poll until status is "awaiting"
const initData = await pollTask(taskId, 'awaiting');
console.log('Please read aloud:', initData.phrase_text);

// → User records themselves reading the phrase ...

// Step 2: Complete — upload verification recording (same taskId)
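// Must be sent within 30 seconds of the task reaching "awaiting"
const completeRes = await fetch(`${API_BASE}/suno/v2/voicePersona/complete`, {
  method: 'POST',
  headers,
  body: JSON.stringify({
    taskId,
    verification_audio_url: 'https://example.com/verification.wav',
    name: 'My Voice',
    description: 'My personal voice'
  })
});
if (!completeRes.ok) throw new Error(`Complete request failed: HTTP ${completeRes.status}`);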

// Poll the SAME taskId until status is "success"
const persona = await pollTask(taskId, 'success');
console.log('Voice Persona created:', persona);

Error Codes

Code	Error	Description
400	VP_TASK_NOT_FOUND	taskId does not exist or is not a Voice Persona task
400	VP_INVALID_STATUS	Task status is not `awaiting`, cannot call complete
408	VP_USER_TIMEOUT	Timeout waiting for complete after init (default 30s)
409	VP_SESSION_EXPIRED	Verification session expired, restart from init
500	VP_LOCK_EXPIRED	Internal lock expired (retry)
503	VP_NO_DEDICATED_ACCOUNT_AVAILABLE	No dedicated account available
503	VP_ALL_ACCOUNTS_BUSY	All account queues are full, retry later
504	VP_ORPHAN_TIMEOUT	Task queuing timeout
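One way to act on these codes is to map each to a recovery strategy. The mapping below is a sketch, not documented behavior: the table only states "retry" for `VP_LOCK_EXPIRED`, "retry later" for `VP_ALL_ACCOUNTS_BUSY`, and "restart from init" for `VP_SESSION_EXPIRED`; the remaining entries are assumptions. It also assumes the error name appears somewhere in the failure text (for example in `failReason`), which this page does not confirm.

```javascript
// Hypothetical recovery actions per error name.
const VP_ERROR_ACTIONS = {
  VP_TASK_NOT_FOUND: 'fix-request',
  VP_INVALID_STATUS: 'fix-request',
  VP_USER_TIMEOUT: 'restart-init',
  VP_SESSION_EXPIRED: 'restart-init',
  VP_LOCK_EXPIRED: 'retry',
  VP_NO_DEDICATED_ACCOUNT_AVAILABLE: 'retry-later', // assumption
  VP_ALL_ACCOUNTS_BUSY: 'retry-later',
  VP_ORPHAN_TIMEOUT: 'retry-later',                 // assumption
};

// Find a known error name inside a failure message and return its action.
function actionForFailure(failReason) {
  const text = String(failReason ?? '');
  const code = Object.keys(VP_ERROR_ACTIONS).find((c) => text.includes(c));
  return code
    ? { code, action: VP_ERROR_ACTIONS[code] }
    : { code: null, action: 'unknown' };
}
```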

Important Notes

The verification recording must clearly contain the full phrase_text content. Incomplete or unclear recordings will cause voice verification to fail.

Single taskId lifecycle: Init and complete use the same taskId — poll one task throughout the entire flow.
awaiting status: After init completes, the task status is awaiting (not success). The data field contains phrase_text for the user to read.
30s time limit: You must call complete within 30 seconds after the task reaches awaiting. Exceeding this causes VP_USER_TIMEOUT.
Simplified parameters: complete only needs taskId + verification recording URL + persona info. All intermediate data is auto-filled by the system.
Same account guarantee: Both phases automatically use the same Suno account.
Language selection: language determines the verification phrase language. Match the language of the original voice audio for best results.
Processing time: Init takes ~20-60s (includes vocal extraction); Complete takes ~10-30s (includes voice verification).
Concurrency safety: The system serializes Voice Persona operations per account — concurrent requests from different users won’t interfere.
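The processing-time estimates above suggest bounding each polling phase rather than looping indefinitely. The following is a minimal sketch: `pollWithBudget` and `PHASE_BUDGET_MS` are hypothetical names, the 2x safety factor over the upper estimates is an assumption, and `fetchStatus` stands for any function that resolves to the `{ status, data, failReason }` object returned by the status endpoint.

```javascript
// Upper estimates from the notes (60s init, 30s complete), doubled as a margin.
const PHASE_BUDGET_MS = { init: 2 * 60000, complete: 2 * 30000 };

const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Poll until the task reaches `target`, fails, or the time budget runs out.
async function pollWithBudget(fetchStatus, { target, budgetMs, intervalMs = 3000, sleep = defaultSleep }) {
  const deadline = Date.now() + budgetMs;
  while (true) {
    const task = await fetchStatus();
    if (task.status === target) return task.data;
    if (task.status === 'failed') throw new Error(task.failReason || 'Task failed');
    // Stop if the next poll would land after the deadline.
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`Gave up waiting for status "${target}"`);
    }
    await sleep(intervalMs);
  }
}
```

For the init phase this could be called with `{ target: 'awaiting', budgetMs: PHASE_BUDGET_MS.init }`, and for the complete phase with `{ target: 'success', budgetMs: PHASE_BUDGET_MS.complete }`.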
