Create a personalized Voice Persona through voice verification. This is a single-task, two-phase process: init (upload voice + get verification phrase) → complete (upload verification recording + create persona). The entire flow uses one taskId.

Workflow

 User's voice audio
      │
      ▼
 ① voicePersona/init
      │  Upload voice → Extract vocals → Return verification phrase
      │
      │  Returns: { taskId }
      │  Poll: GET /suno/v2/status?taskId=xxx
      │  Wait for status == "awaiting"
      │  data: { phrase_text, ... }
      │
      ▼
 User reads phrase_text aloud and records (within 30s timeout)
      │
      ▼
 ② voicePersona/complete (same taskId)
      │  Upload verification recording → Voice verification → Create Persona
      │
      │  Poll same taskId: GET /suno/v2/status?taskId=xxx
      │  Wait for status == "success"
      │  data: persona details
      │
      ▼
    Done → Use persona in /generate

Task Status Flow

queued → running → awaiting → running → success
                      │                    │
                      │                    └── complete failed → failed
                      └── User timeout → failed (VP_USER_TIMEOUT)
After the task reaches awaiting status, you must call complete within 30 seconds (default). If the timeout is exceeded, the task will fail with VP_USER_TIMEOUT and you’ll need to restart from init.
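
One way to reduce the risk of VP_USER_TIMEOUT is to track the deadline on the client from the moment polling first observes awaiting. The sketch below assumes the default 30-second window. The names createCompleteDeadline and AWAITING_WINDOW_MS are hypothetical and not part of the API. Because the server's clock presumably starts when the task enters awaiting rather than when your poll sees it, the helper subtracts an assumed safety margin.

```javascript
// Hypothetical client-side guard; names are illustrative, not part of the API.
const AWAITING_WINDOW_MS = 30_000; // default window stated in this guide

// observedAt: timestamp (ms) when polling first saw status "awaiting".
// safetyMarginMs: assumed allowance for poll delay and network time.
function createCompleteDeadline(observedAt, safetyMarginMs = 5_000) {
  const deadline = observedAt + AWAITING_WINDOW_MS - safetyMarginMs;
  return {
    deadline,
    // Milliseconds left before complete should have been sent.
    remainingMs: (now = Date.now()) => Math.max(0, deadline - now),
    // True once the client-side deadline has passed.
    isExpired: (now = Date.now()) => now >= deadline,
  };
}
```

A caller could create the deadline right after polling returns the awaiting data, show remainingMs() as a countdown while the user records, and restart from init instead of calling complete once isExpired() is true.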

Step 1: Init — Upload Voice & Get Verification Phrase

Upload the user’s voice audio. The system extracts vocals and returns a verification phrase that the user must read aloud.
This is an async task. Poll Get Task Status with the returned taskId. Wait for status to become awaiting (not success).

Request

POST /suno/v2/voicePersona/init
| Field | Type | Required | Description |
| --- | --- | --- | --- |
| voice_audio_url | string (URL) | Yes | Publicly downloadable URL of the voice audio (WAV/MP3) |
| language | string | Yes | Verification phrase language: zh, en, ja, ko, es, fr, de, pt, ru, hi |
| vocal_start_s | number | No | Vocal extraction start time (seconds), default: 0 |
| vocal_end_s | number | No | Vocal extraction end time (seconds), default: auto-detected |
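
The init fields can be validated on the client before the request is sent. The following is a minimal sketch, not an official client: buildInitBody and VP_LANGUAGES are hypothetical names, and the http(s) check and the rule that vocal_end_s must exceed vocal_start_s are assumptions rather than documented server behavior.

```javascript
// Language codes listed for the init request in this guide.
const VP_LANGUAGES = ['zh', 'en', 'ja', 'ko', 'es', 'fr', 'de', 'pt', 'ru', 'hi'];

// Hypothetical helper: validates inputs and returns the JSON body for init.
function buildInitBody({ voiceAudioUrl, language, vocalStartS, vocalEndS }) {
  // Assumption: a "publicly downloadable URL" means an http(s) URL.
  if (!/^https?:\/\//.test(voiceAudioUrl ?? '')) {
    throw new Error('voice_audio_url must be an http(s) URL');
  }
  if (!VP_LANGUAGES.includes(language)) {
    throw new Error(`Unsupported language: ${language}`);
  }
  // Assumption: when both bounds are given, the end must come after the start.
  if (vocalStartS !== undefined && vocalEndS !== undefined && vocalEndS <= vocalStartS) {
    throw new Error('vocal_end_s must be greater than vocal_start_s');
  }
  const body = { voice_audio_url: voiceAudioUrl, language };
  // Optional fields are omitted so the server applies its own defaults.
  if (vocalStartS !== undefined) body.vocal_start_s = vocalStartS;
  if (vocalEndS !== undefined) body.vocal_end_s = vocalEndS;
  return body;
}
```

The returned object can be passed to JSON.stringify and used as the body of the POST request to /suno/v2/voicePersona/init.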

Polling Result (status: awaiting)

When the task reaches awaiting status, data contains:
| Field | Description |
| --- | --- |
| phrase_text | Verification phrase text; the user must read this aloud and record it |
| phrase_id | Verification phrase ID (internal) |
| vox_audio_id | Extracted vocal audio ID (internal) |
| voice_recording_id | Recording ID (internal) |
| vocal_start_s | Vocal start time (seconds) |
| vocal_end_s | Vocal end time (seconds) |
Only phrase_text is needed by the user. All other fields are used internally by the system — you do not need to pass them to the complete step.
See Init API Reference →

Step 2: Complete — Upload Verification Recording & Create Persona

After the user reads phrase_text aloud and records it, upload the verification recording using the same taskId to complete voice verification and create the persona.
Call complete with the taskId returned by init, then continue polling that same taskId until status becomes success.

Request

POST /suno/v2/voicePersona/complete
| Field | Type | Required | Description |
| --- | --- | --- | --- |
| taskId | string (UUID) | Yes | The taskId from init (same task) |
| verification_audio_url | string (URL) | Yes | User’s verification recording URL (WAV/MP3) |
| name | string | Yes | Persona name |
| description | string | No | Persona description |
| is_public | boolean | No | Whether public (default: false) |
| image_s3_id | string | No | Cover image (base64), auto-generated if not provided |
No intermediate data (vox_audio_id, phrase_id, etc.) is needed — the system reads them automatically from the init phase.
See Complete API Reference →
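
Since complete has to arrive inside the awaiting window, it can help to catch malformed input before sending anything. The sketch below builds the complete request body from the documented fields. buildCompleteBody and UUID_RE are hypothetical names, and the client-side UUID and non-empty-name checks are assumptions based on the field types listed for this request, not documented validation rules.

```javascript
// Assumption: taskId follows the canonical 8-4-4-4-12 hexadecimal UUID layout.
const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Hypothetical helper: returns the JSON body for voicePersona/complete.
function buildCompleteBody({ taskId, verificationAudioUrl, name, description, isPublic }) {
  if (!UUID_RE.test(taskId ?? '')) {
    throw new Error('taskId must be the UUID returned by init');
  }
  if (!verificationAudioUrl) {
    throw new Error('verification_audio_url is required');
  }
  if (!name || !name.trim()) {
    throw new Error('name is required');
  }
  const body = {
    taskId,
    verification_audio_url: verificationAudioUrl,
    name: name.trim(),
  };
  // Optional fields are only sent when the caller provides them.
  if (description !== undefined) body.description = description;
  if (isPublic !== undefined) body.is_public = isPublic;
  return body;
}
```

Intermediate values such as phrase_id or vox_audio_id are deliberately absent, matching the note that the system reads them from the init phase.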

Complete Example

const API_BASE = 'https://api.mountsea.ai';
const headers = {
  'Content-Type': 'application/json',
  'Authorization': 'Bearer your-api-key'
};

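// Poll the task every 3 seconds until it reaches targetStatus; throws if the task fails.
async function pollTask(taskId, targetStatus = 'success') {
  while (true) {
    const res = await fetch(`${API_BASE}/suno/v2/status?taskId=${encodeURIComponent(taskId)}`, { headers });
    if (!res.ok) throw new Error(`Status request failed: HTTP ${res.status}`);
    const task = await res.json();
    if (task.status === targetStatus) return task.data;
    // "success" is terminal: return instead of polling forever if the target status was missed.
    if (task.status === 'success') return task.data;
    if (task.status === 'failed') throw new Error(task.failReason);
    await new Promise(r => setTimeout(r, 3000));
  }
}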

// Step 1: Init — upload voice and get verification phrase
const initRes = await fetch(`${API_BASE}/suno/v2/voicePersona/init`, {
  method: 'POST',
  headers,
  body: JSON.stringify({
    voice_audio_url: 'https://example.com/my-voice.wav',
    language: 'zh'
  })
});
if (!initRes.ok) throw new Error(`init failed: HTTP ${initRes.status}`);
const { taskId } = await initRes.json();

// Poll until status is "awaiting"
const initData = await pollTask(taskId, 'awaiting');
console.log('Please read aloud:', initData.phrase_text);

// → User records themselves reading the phrase ...

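// Step 2: Complete: upload verification recording (same taskId).
// Send this within the 30s awaiting window, or the task fails with VP_USER_TIMEOUT.
const completeRes = await fetch(`${API_BASE}/suno/v2/voicePersona/complete`, {
  method: 'POST',
  headers,
  body: JSON.stringify({
    taskId,
    verification_audio_url: 'https://example.com/verification.wav',
    name: 'My Voice',
    description: 'My personal voice'
  })
});
if (!completeRes.ok) throw new Error(`complete failed: HTTP ${completeRes.status}`);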

// Poll the SAME taskId until status is "success"
const persona = await pollTask(taskId, 'success');
console.log('Voice Persona created:', persona);
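
The polling loop in the example above runs until the task reaches a terminal state, so a production client may want an upper bound on how long it waits. The sketch below is one possible shape, not part of the API. pollWithTimeout is a hypothetical name, the 120-second default is an assumed allowance, and getStatus, sleep, and now are injected so the loop can be exercised without network calls or real delays.

```javascript
// Hypothetical bounded poller. getStatus must resolve to { status, data, failReason }.
async function pollWithTimeout(getStatus, targetStatus, options = {}) {
  const {
    timeoutMs = 120_000, // assumed overall budget
    intervalMs = 3_000,  // same interval as the example in this guide
    sleep = ms => new Promise(r => setTimeout(r, ms)),
    now = Date.now,
  } = options;
  const startedAt = now();
  while (true) {
    const task = await getStatus();
    if (task.status === targetStatus) return task.data;
    if (task.status === 'failed') throw new Error(task.failReason ?? 'task failed');
    // Give up if the next poll would land past the budget.
    if (now() - startedAt + intervalMs > timeoutMs) {
      throw new Error(`Timed out waiting for status "${targetStatus}"`);
    }
    await sleep(intervalMs);
  }
}
```

With the example's API_BASE and headers, getStatus could be a function that fetches /suno/v2/status?taskId=... and returns the parsed JSON.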

Error Codes

| Code | Error | Description |
| --- | --- | --- |
| 400 | VP_TASK_NOT_FOUND | taskId does not exist or is not a Voice Persona task |
| 400 | VP_INVALID_STATUS | Task status is not awaiting, cannot call complete |
| 408 | VP_USER_TIMEOUT | Timeout waiting for complete after init (default 30s) |
| 409 | VP_SESSION_EXPIRED | Verification session expired, restart from init |
| 500 | VP_LOCK_EXPIRED | Internal lock expired (retry) |
| 503 | VP_NO_DEDICATED_ACCOUNT_AVAILABLE | No dedicated account available |
| 503 | VP_ALL_ACCOUNTS_BUSY | All account queues are full, retry later |
| 504 | VP_ORPHAN_TIMEOUT | Task queuing timeout |
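
The descriptions above suggest three kinds of client reaction: retry the same request, restart from init, or give up. The sketch below encodes that grouping. nextAction is a hypothetical helper, and treating every remaining error as 'abort' is an assumption. This page does not specify how the error name appears in a response, so the function takes the name as a plain string.

```javascript
// Described as "retry" or "retry later" in the error table.
const RETRY_SAME_REQUEST = new Set(['VP_LOCK_EXPIRED', 'VP_ALL_ACCOUNTS_BUSY']);
// Described as requiring a restart from init.
const RESTART_FROM_INIT = new Set(['VP_USER_TIMEOUT', 'VP_SESSION_EXPIRED']);

// Hypothetical helper: maps an error name to a suggested client action.
function nextAction(errorName) {
  if (RETRY_SAME_REQUEST.has(errorName)) return 'retry';
  if (RESTART_FROM_INIT.has(errorName)) return 'restart';
  // Assumption: anything else is surfaced to the caller without retrying.
  return 'abort';
}
```

A client could call nextAction inside its error handler and add a backoff delay before acting on 'retry'.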

Important Notes

The verification recording must clearly contain the full phrase_text content. Incomplete or unclear recordings will cause voice verification to fail.
  • Single taskId lifecycle: Init and complete use the same taskId — poll one task throughout the entire flow.
  • awaiting status: After init completes, the task status is awaiting (not success). The data field contains phrase_text for the user to read.
  • 30s time limit: You must call complete within 30 seconds after the task reaches awaiting. Exceeding this causes VP_USER_TIMEOUT.
  • Simplified parameters: complete only needs taskId + verification recording URL + persona info. All intermediate data is auto-filled by the system.
  • Same account guarantee: Both phases automatically use the same Suno account.
  • Language selection: language determines the verification phrase language. Match the language of the original voice audio for best results.
  • Processing time: Init takes ~20-60s (includes vocal extraction); Complete takes ~10-30s (includes voice verification).
  • Concurrency safety: The system serializes Voice Persona operations per account — concurrent requests from different users won’t interfere.