
1. Speech-to-Text (STT)

Endpoint

POST /api/v1/stt
Content-Type: multipart/form-data

Request Parameters

| Parameter | Type | Description |
|---|---|---|
| audio | file | Audio file (MP3, WAV, M4A) |
| language | string | Language code (uz, ru, en). If omitted, the language is auto-detected |
| blocking | boolean | true = wait for completion and return the text, false = return a task identifier immediately |
| webhook_url | string | URL that receives a POST request with the result |

Limits

| Limit | Value |
|---|---|
| Max audio duration | 60 minutes |
| Sync (blocking) threshold | 2 minutes |

Notes:

  • If the audio is longer than 2 minutes, the request automatically switches to async mode. For such files, send blocking=false and provide webhook_url.
  • If the audio is longer than 60 minutes, the request is rejected with the error audio_too_long.
  • webhook_url is required for non-blocking (async) tasks.
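
The limits above can be condensed into a small decision helper. This is only a sketch: the name `choose_stt_mode` and its return labels are hypothetical and not part of the API, and it assumes the thresholds are inclusive (a clip of exactly 2 minutes still qualifies for blocking mode), which the notes do not state explicitly.

```python
SYNC_THRESHOLD_SECONDS = 2 * 60   # sync (blocking) threshold from the Limits table
MAX_DURATION_SECONDS = 60 * 60    # max audio duration from the Limits table


def choose_stt_mode(duration_seconds: float) -> str:
    """Return "blocking" or "async" for a clip, or raise if it is too long.

    Hypothetical client-side helper; boundary handling (<= vs <) is an assumption.
    """
    if duration_seconds > MAX_DURATION_SECONDS:
        # The API reports this case with the error code audio_too_long.
        raise ValueError("audio_too_long")
    if duration_seconds > SYNC_THRESHOLD_SECONDS:
        # Longer clips are processed asynchronously, so webhook_url is required.
        return "async"
    return "blocking"
```

Checking the duration on the client before uploading avoids sending a file that would be rejected or silently moved to async mode.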

Pricing

| Duration | Cost |
|---|---|
| 1 – 60 seconds | 300 UZS (flat) |
| 60+ seconds | 5 UZS per second (seconds × 5) |

Examples: 72s = 360 UZS, 120s = 600 UZS, 300s = 1,500 UZS, 600s = 3,000 UZS
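
The pricing rule can be written as a short function for estimating cost before a request is sent. This is a sketch, not an official calculator: `stt_cost_uzs` is a hypothetical name, and it assumes the duration is given in whole seconds, since the page does not say how fractional seconds are rounded. Note that above 60 seconds the per-second rate applies to the whole duration, not only to the part beyond the first minute, as the listed examples show (72 × 5 = 360).

```python
FLAT_RATE_UZS = 300   # flat charge for clips of 1 to 60 seconds
PER_SECOND_UZS = 5    # rate applied to the full duration above 60 seconds


def stt_cost_uzs(duration_seconds: int) -> int:
    """Estimate the cost in UZS for a clip of the given length in whole seconds."""
    if duration_seconds < 1:
        raise ValueError("duration must be at least 1 second")
    if duration_seconds <= 60:
        return FLAT_RATE_UZS
    return duration_seconds * PER_SECOND_UZS
```

For instance, `stt_cost_uzs(72)` returns 360 and `stt_cost_uzs(600)` returns 3000, matching the examples above.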

Example Request

curl -X POST "https://developer.kotib.ai/api/v1/stt" \
 -H "Authorization: Bearer <api-key>" \
 -F "audio=@audio.mp3" \
 -F "language=uz" \
 -F "blocking=false" \
 -F "webhook_url=https://example.com/stt-webhook"
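
The same request could be issued from Python roughly as follows. This is a sketch under stated assumptions: it uses the third-party `requests` library, the helper names `build_stt_fields` and `transcribe` are hypothetical, and the client-side check simply mirrors the note that webhook_url is required for non-blocking tasks. Whether the API signals errors through HTTP status codes is not documented here, so the `raise_for_status()` call is a precaution rather than documented behaviour.

```python
from typing import Any, Dict, Optional

STT_URL = "https://developer.kotib.ai/api/v1/stt"


def build_stt_fields(blocking: bool, language: Optional[str] = None,
                     webhook_url: Optional[str] = None) -> Dict[str, str]:
    """Build the non-file form fields, sent as strings like the curl -F flags."""
    if not blocking and not webhook_url:
        # Mirrors the documented rule for async tasks (client-side check only).
        raise ValueError("webhook_url is required when blocking is false")
    fields = {"blocking": "true" if blocking else "false"}
    if language is not None:
        fields["language"] = language  # omit to let the service auto-detect
    if webhook_url is not None:
        fields["webhook_url"] = webhook_url
    return fields


def transcribe(api_key: str, audio_path: str, **options: Any) -> Dict[str, Any]:
    """Upload an audio file and return the decoded JSON response."""
    import requests  # third-party; imported here so build_stt_fields works without it

    with open(audio_path, "rb") as audio_file:
        response = requests.post(
            STT_URL,
            headers={"Authorization": f"Bearer {api_key}"},
            files={"audio": audio_file},        # multipart/form-data file part
            data=build_stt_fields(**options),   # remaining form fields
            timeout=300,
        )
    response.raise_for_status()
    return response.json()
```

A call such as `transcribe("<api-key>", "audio.mp3", blocking=False, language="uz", webhook_url="https://example.com/stt-webhook")` would correspond to an async request, with the webhook URL being a placeholder.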

Example Response (blocking = false)

{
  "status": "processing",
  "id": "stt_123456789",
  "message": "Task accepted. Check status via /get-status."
}

Example Response (blocking = true)

{
  "status": "success",
  "text": "Salom, dunyo!"
}
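
A client has to tell the two response shapes apart. The sketch below does this using only the fields shown in the examples above; the function name `interpret_stt_response` and the returned labels are hypothetical, and other status values (for example errors) are not documented on this page, so they are treated as unexpected.

```python
from typing import Any, Dict, Tuple


def interpret_stt_response(payload: Dict[str, Any]) -> Tuple[str, str]:
    """Return ("text", transcript) or ("task", task_id) for the two documented shapes."""
    status = payload.get("status")
    if status == "success":
        # Blocking request finished: the transcript is in "text".
        return ("text", payload["text"])
    if status == "processing":
        # Async request accepted: keep "id" to match the later result.
        return ("task", payload["id"])
    raise ValueError(f"unexpected STT response status: {status!r}")
```

With the example payloads, the blocking response yields `("text", "Salom, dunyo!")` and the non-blocking one yields `("task", "stt_123456789")`.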
