Supported languages

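The real-time API accepts two-letter ISO 639-1 language codes (en, es, zh, …), plus three-letter ISO 639-3 codes for languages that have no two-letter code (fil, yue). Regional tags such as en-US and pt-BR are accepted, but only the primary subtag is used.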
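Because only the primary subtag is used, a client can normalize tags before sending them, for example to compare a user's locale against the table below. The following is a minimal sketch; the helper name primary_subtag is hypothetical and is not part of the Pinch API.

```python
def primary_subtag(tag: str) -> str:
    """Return the lowercase primary language subtag of a tag like "en-US"."""
    # Everything before the first hyphen is the primary subtag.
    head = tag.strip().split("-", 1)[0]
    if not head.isalpha():
        raise ValueError(f"not a language tag: {tag!r}")
    return head.lower()


print(primary_subtag("en-US"))  # en
print(primary_subtag("pt-BR"))  # pt
print(primary_subtag("yue"))    # yue
```

Normalizing on the client is optional, since the API is described as accepting regional tags directly.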

Pass auto as sourceLanguage to let Pinch detect the spoken language; the detected code is returned on every transcript frame as detected_language.
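Since detected_language arrives on every transcript frame, a client may want to settle on one language for a session rather than react to each frame. The sketch below tallies the field across frames. It assumes each frame is a JSON object; the "text" field and the function name dominant_language are illustrative assumptions, and only detected_language comes from the description above.

```python
import json
from collections import Counter
from typing import Iterable, Optional


def dominant_language(raw_frames: Iterable[str]) -> Optional[str]:
    """Return the most frequently detected language code, or None."""
    counts: Counter = Counter()
    for raw in raw_frames:
        frame = json.loads(raw)
        code = frame.get("detected_language")
        if code:  # skip frames that carry no detection result
            counts[code] += 1
    return counts.most_common(1)[0][0] if counts else None


# Hypothetical frames; the real frame layout may differ.
frames = [
    '{"text": "hola", "detected_language": "es"}',
    '{"text": "buenos dias", "detected_language": "es"}',
    '{"text": "ok", "detected_language": "en"}',
]
print(dominant_language(frames))  # es
```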

Languages

Every code below is supported as both sourceLanguage (input speech) and targetLanguage (output text). The Voice output column marks which languages also have a synthesized voice — use these with audioOutputEnabled=true. Languages without a voice are text-only and must be requested with audioOutputEnabled=false.

Code	Language	Voice output
`ar`	Arabic	✓
`cs`	Czech	✓
`da`	Danish	✓
`de`	German	✓
`el`	Greek	✓
`en`	English	✓
`es`	Spanish	✓
`fa`	Persian (Farsi)	✓
`fi`	Finnish	✓
`fil`	Filipino	—
`fr`	French	✓
`hi`	Hindi	✓
`hu`	Hungarian	✓
`id`	Indonesian	✓
`it`	Italian	✓
`ja`	Japanese	—
`ko`	Korean	—
`mk`	Macedonian	—
`ms`	Malay	—
`nl`	Dutch	✓
`pl`	Polish	✓
`pt`	Portuguese	✓
`ro`	Romanian	✓
`ru`	Russian	✓
`sv`	Swedish	✓
`th`	Thai	—
`tr`	Turkish	✓
`vi`	Vietnamese	✓
`yue`	Cantonese	—
`zh`	Chinese (Mandarin)	✓

Every voice-output language ships with both a male and a female voice — select via voiceType (male or female).

Requesting a targetLanguage that has no voice output with audioOutputEnabled=true returns an error frame. Pass audioOutputEnabled=false to get text transcripts in any of the languages above.
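The voice rules above can be checked on the client before a request is sent, so that an error frame is avoided. The sketch below copies the language sets from the table. The function build_session_config and the dictionary it returns are assumptions for illustration: this page does not specify how the parameters are transmitted, and "female" as the default voice is an arbitrary choice.

```python
# Codes marked with a voice in the table above.
VOICE_LANGUAGES = {
    "ar", "cs", "da", "de", "el", "en", "es", "fa", "fi", "fr", "hi", "hu",
    "id", "it", "nl", "pl", "pt", "ro", "ru", "sv", "tr", "vi", "zh",
}
# Codes listed as text-only.
TEXT_ONLY_LANGUAGES = {"fil", "ja", "ko", "mk", "ms", "th", "yue"}
SUPPORTED_LANGUAGES = VOICE_LANGUAGES | TEXT_ONLY_LANGUAGES


def build_session_config(source, target, audio_output=True, voice_type="female"):
    """Validate a language pair and return a parameter dict (hypothetical shape)."""
    # Only the primary subtag matters, so "pt-BR" is treated as "pt".
    src = source if source == "auto" else source.split("-", 1)[0].lower()
    tgt = target.split("-", 1)[0].lower()
    if src != "auto" and src not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported sourceLanguage: {source!r}")
    if tgt not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported targetLanguage: {target!r}")
    config = {
        "sourceLanguage": src,
        "targetLanguage": tgt,
        "audioOutputEnabled": audio_output,
    }
    if audio_output:
        if tgt not in VOICE_LANGUAGES:
            raise ValueError(f"{tgt!r} is text-only; set audio_output=False")
        if voice_type not in ("male", "female"):
            raise ValueError("voiceType must be 'male' or 'female'")
        config["voiceType"] = voice_type
    return config


print(build_session_config("auto", "es-MX", voice_type="male"))
print(build_session_config("en", "ja", audio_output=False))
```

Calling build_session_config("en", "ja") with the default audio_output=True raises ValueError, which mirrors the error frame the API is described as returning for that combination.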