Whisper ASR#
Whisper ASR checkpoints can be started through the OpenAI-compatible /v1/audio/transcriptions endpoint, but this path is experimental in the current SGLang-Omni tree. Prefer Qwen3-ASR for validated ASR serving.
Prerequisites#
Install sglang-omni by following Installation, then download a Whisper checkpoint:
hf download openai/whisper-large-v3
Server Configuration#
Whisper ASR runs a single ASR stage on one GPU.
sgl-omni serve \
  --model-path openai/whisper-large-v3 \
  --port 8000
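Because first startup can take several minutes, a client script may want to wait before sending its first request. The following is a minimal sketch that polls the TCP port until it accepts connections. The helper name `wait_for_port` and its timeout values are illustrative assumptions, not part of sglang-omni. An open port is only a coarse signal and does not by itself confirm that the model has finished loading.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout_s: float = 600.0,
                  interval_s: float = 2.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout.

    Hypothetical helper: it checks only that something is listening,
    not that the ASR stage is ready to serve requests.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # create_connection raises OSError while nothing is listening.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_s)
```

For the command above, `wait_for_port("localhost", 8000)` would block until the port opens or ten minutes pass.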
Transcribe Audio#
curl -X POST http://localhost:8000/v1/audio/transcriptions \
  -F model=openai/whisper-large-v3 \
  -F file=@tests/data/query_to_cars.wav \
  -F response_format=json
import requests

# Upload the WAV as multipart form data; the other fields go in `data`.
with open("tests/data/query_to_cars.wav", "rb") as f:
    resp = requests.post(
        "http://localhost:8000/v1/audio/transcriptions",
        data={
            "model": "openai/whisper-large-v3",
            "response_format": "json",
        },
        files={"file": ("query_to_cars.wav", f, "audio/wav")},
        timeout=300,
    )

# Fail on HTTP errors before reading the transcript.
resp.raise_for_status()
print(resp.json()["text"])
Request Parameters#
| Parameter | Type | Default | Description |
|---|---|---|---|
| `file` | file | required | Audio file uploaded as multipart form data |
| `model` | string | server default | Model identifier |
| `language` | string | unset | Optional language hint |
| `response_format` | string | | Use `json` |
| `temperature` | float | unset | Optional sampling temperature |
The request builder also supports task (transcribe by default) and
max_new_tokens, but the public transcription endpoint currently exposes only
the fields above. The route uses the ASR stage default unless the pipeline is
configured another way. For smoke tests, keep the request minimal and use
response_format=json.
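One way to keep requests minimal is to build the form fields from only the options that are actually set. The sketch below assumes the OpenAI-style field names `language` and `temperature` for the optional language hint and sampling temperature. The helper `build_transcription_form` is hypothetical and not part of sglang-omni.

```python
from typing import Optional


def build_transcription_form(
    model: Optional[str] = None,
    language: Optional[str] = None,
    temperature: Optional[float] = None,
    response_format: str = "json",
) -> dict:
    """Collect multipart form fields, omitting anything left unset.

    Field names follow the OpenAI transcription API (an assumption here).
    """
    fields = {"response_format": response_format}
    if model is not None:
        fields["model"] = model
    if language is not None:
        fields["language"] = language
    if temperature is not None:
        # Form fields are sent as text, so serialize the float explicitly.
        fields["temperature"] = str(temperature)
    return fields
```

The returned dictionary can be passed as the `data=` argument of `requests.post`, alongside `files=` for the audio upload. With no arguments, the result contains only `response_format`, which matches the smoke-test advice above.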
Known Limitations#
This path is experimental and not yet correctness-validated. Prefer Qwen3-ASR for validated ASR serving.
Keep Whisper ASR at encoder batch size 1.
Use response_format=json; other response formats are not validated for this experimental path.
First startup can take several minutes.
The endpoint accepts one uploaded file per request.
Audio is resampled to 16 kHz before transcription.
prompt is accepted by the HTTP endpoint for OpenAI compatibility, but Whisper ASR currently does not pass it into decoding.
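Since audio is resampled to 16 kHz and each request carries one file, it can help to inspect an input before uploading it. The sketch below reads a PCM WAV header with Python's standard `wave` module. The helper name `wav_info` is illustrative, and the check is optional because the server performs the resampling itself.

```python
import wave


def wav_info(path: str) -> tuple:
    """Return (sample_rate_hz, channels, duration_s) for a PCM WAV file."""
    with wave.open(path, "rb") as wf:
        rate = wf.getframerate()
        # Duration is the frame count divided by frames per second.
        return rate, wf.getnchannels(), wf.getnframes() / rate
```

Calling `wav_info("tests/data/query_to_cars.wav")` on the sample file used earlier would report its native sample rate. A value other than 16000 indicates that the server will resample the audio.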