Speech to Text
Whisper V2 API
Process audio files for transcription or translation with advanced options
POST /model/v2/infer/whisper
curl --request POST \
  --url https://http.whisper.proxy.prod.s9t.link/model/v2/infer/whisper \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
    "audio_data": "base64_encoded_audio_content",
    "language": "en",
    "task": "transcribe",
    "beam_size": 5,
    "best_of": 5,
    "word_timestamps": 1,
    "diarization": 0,
    "streaming": 0,
    "batch_size": 24,
    "length_penalty": 1,
    "patience": 1,
    "vad_onset": 0.5,
    "vad_offset": 0.363
  }'
{
  "transcription": [
    "<string>"
  ],
  "segments": [
    {
      "start": 123,
      "end": 123,
      "text": "<string>",
      "words": [
        {
          "word": "<string>",
          "start": 123,
          "end": 123
        }
      ]
    }
  ],
  "request_time": 123,
  "language": "<string>"
}
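The same request can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the helper names `build_payload` and `build_request` are hypothetical, and it assumes `audio_data` expects standard base64 of the raw audio file bytes. The field names and default values are copied from the curl example; whether any of them may be omitted is not stated here, so the sketch sends all of them.

```python
import base64
import json
import urllib.request

# Endpoint URL copied from the curl example.
WHISPER_URL = "https://http.whisper.proxy.prod.s9t.link/model/v2/infer/whisper"


def build_payload(audio_bytes: bytes, language: str = "en",
                  task: str = "transcribe") -> dict:
    """Build the JSON body (hypothetical helper).

    Assumption: audio_data is standard base64 of the raw file bytes.
    All other values mirror the curl example.
    """
    return {
        "audio_data": base64.b64encode(audio_bytes).decode("ascii"),
        "language": language,
        "task": task,
        "beam_size": 5,
        "best_of": 5,
        "word_timestamps": 1,
        "diarization": 0,
        "streaming": 0,
        "batch_size": 24,
        "length_penalty": 1,
        "patience": 1,
        "vad_onset": 0.5,
        "vad_offset": 0.363,
    }


def build_request(payload: dict, token: str) -> urllib.request.Request:
    """Wrap the payload in a POST request with the Bearer token header."""
    return urllib.request.Request(
        WHISPER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send the request, pass the result to `urllib.request.urlopen` and decode the body with `json.load`. Building the request separately from sending it makes it possible to inspect the headers and payload without a network call.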
Authorizations
JWT token for authentication
Body
application/json
Response
200
application/json
Successful transcription
The response is of type object.
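A short sketch of reading a decoded response object of the shape shown in the sample response. The function names `full_text` and `iter_words` are hypothetical, and the units of `start` and `end` are not stated here, so the code passes them through unchanged. It also assumes `words` may be absent from a segment (for example when word timestamps are not requested) and treats that as an empty list.

```python
from typing import Iterator, Tuple


def full_text(response: dict) -> str:
    """Join the entries of the "transcription" array into one string."""
    return " ".join(part.strip() for part in response.get("transcription", []))


def iter_words(response: dict) -> Iterator[Tuple[str, float, float]]:
    """Yield (word, start, end) for every word in every segment."""
    for segment in response.get("segments", []):
        # Assumption: "words" may be missing; fall back to an empty list.
        for word in segment.get("words", []):
            yield word["word"], word["start"], word["end"]


# Example input with the same shape as the sample response (values invented).
sample = {
    "transcription": ["hello world"],
    "segments": [
        {
            "start": 0.0,
            "end": 1.2,
            "text": "hello world",
            "words": [
                {"word": "hello", "start": 0.0, "end": 0.5},
                {"word": "world", "start": 0.6, "end": 1.2},
            ],
        }
    ],
    "request_time": 0.8,
    "language": "en",
}

words = list(iter_words(sample))
```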