Voice Agents use STT (Speech to Text), to transcribe what the user speaks. This transcribed speech as text goes into an LLM to generate the response that gets played out to the user.
Dograh platform ships with Deepgram, Cartesia, OpenAI and Dograh transcribers by default. You can take a look at the providers documentation of which language to select for your language requirements.Example: Deepgram has their language support documentation at https://developers.deepgram.com/docs/models-languages-overview#nova-3