Voice Input

What is Voice Input?

Voice Input is TalkCody's speech-to-text feature: you speak, and the app transcribes your words into the input box so you don't have to type. It is convenient for entering long text quickly, describing programming problems, or using the app on the go.

TalkCody supports multiple transcription providers, and you can choose the appropriate service based on your needs.

Supported Transcription Providers

Provider	Model	Features
Eleven Labs	Scribe	High-quality real-time multilingual transcription
OpenAI	Whisper	Industry-leading speech recognition, supports multiple languages
Google	Gemini	Multimodal understanding, supports long audio

To use voice input, configure an API key for at least one transcription provider and select a transcription model in Settings.

Configuration

Get API Key

Obtain the API Key for the transcription service you want to use:

Eleven Labs (Recommended)

Visit elevenlabs.io
Click "Sign Up" to register an account
After logging in, click your avatar in the top right → "Profile + API key"
Click the eye icon in the API Key section to view the key
Copy the API key

OpenAI

Visit platform.openai.com
Log in or register an account
Click "Create new secret key" to create a key
Copy the key (format: sk-...)

Google AI

Visit aistudio.google.com
Log in with your Google account
Click "Create API key in new project"
Copy the generated key

Configure API Key

Open TalkCody
Click the Settings icon
Navigate to the API Keys page
Paste your API Key in the corresponding provider's input field
Click Test Key to verify it works
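
How TalkCody's Test Key button verifies a key is not documented here. As a rough sketch of what such a check can look like, the example below sends an authenticated request to OpenAI's model-listing endpoint and treats a 401 response as an invalid key. The function names are hypothetical, and the choice of endpoint is an assumption made for illustration.

```python
import urllib.error
import urllib.request

# OpenAI's model-listing endpoint; any authenticated GET would serve the same purpose.
OPENAI_MODELS_URL = "https://api.openai.com/v1/models"


def build_key_check_request(api_key: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated request. Hypothetical helper."""
    return urllib.request.Request(
        OPENAI_MODELS_URL,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def key_is_valid(api_key: str, timeout: float = 10.0) -> bool:
    """Return True if the key is accepted, False if the server answers 401."""
    try:
        request = build_key_check_request(api_key)
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 401:  # rejected credentials
            return False
        raise  # other HTTP errors say nothing about the key itself
```

Network failures raise `urllib.error.URLError` rather than returning False, so a connectivity problem is not mistaken for a bad key.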

Select Transcription Model

In the Settings page, navigate to Model Settings
Find the Transcription Model option
Select the transcription model you want to use from the dropdown:
- eleven_scribe_v1 - Eleven Labs Scribe
- whisper-1 - OpenAI Whisper
- gemini-2.0-flash - Google Gemini
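
The relationship between API keys and selectable models can be sketched as a simple lookup. The model IDs and provider names below come from the list above; the function names and the shape of the `api_keys` dictionary are assumptions for illustration, not TalkCody's actual settings format.

```python
from typing import Dict, List

# Model IDs and their providers, as listed in the Transcription Model dropdown.
TRANSCRIPTION_MODELS: Dict[str, str] = {
    "eleven_scribe_v1": "Eleven Labs",
    "whisper-1": "OpenAI",
    "gemini-2.0-flash": "Google",
}


def usable_models(api_keys: Dict[str, str]) -> List[str]:
    """Return the model IDs whose provider has a non-empty API key (hypothetical helper)."""
    return [
        model
        for model, provider in TRANSCRIPTION_MODELS.items()
        if api_keys.get(provider, "").strip()
    ]


def can_use_voice_input(api_keys: Dict[str, str], selected_model: str) -> bool:
    """Voice input needs a selected model whose provider key is configured."""
    return selected_model in usable_models(api_keys)
```

For example, with only an OpenAI key configured, `usable_models({"OpenAI": "sk-..."})` yields just `whisper-1`, and selecting `eleven_scribe_v1` would not be enough to enable voice input.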

How to Use

After configuration, you can use the voice input feature in the chat input box:

Click the microphone icon next to the input box
Allow browser access to the microphone (first time only)
Start speaking; your voice is recorded in real time
Click the Stop button when finished
The speech is automatically transcribed and the text is placed in the input box
Edit the transcribed text if needed, then send it
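
The steps above amount to a small record, transcribe, edit cycle. The sketch below models that cycle in code; the `VoiceInputSession` class and its methods are hypothetical, and the transcriber is passed in as a plain function so that any provider could be plugged in. It is not TalkCody's implementation.

```python
from typing import Callable, List


class VoiceInputSession:
    """Hypothetical model of one record -> transcribe cycle."""

    def __init__(self) -> None:
        self.recording = False
        self._chunks: List[bytes] = []

    def start(self) -> None:
        # Microphone icon clicked: begin a fresh recording.
        self.recording = True
        self._chunks = []

    def add_chunk(self, chunk: bytes) -> None:
        # Audio is buffered only while recording is active.
        if self.recording:
            self._chunks.append(chunk)

    def stop(self, transcribe: Callable[[bytes], str]) -> str:
        # Stop clicked: hand the buffered audio to the transcriber and
        # return the text that would be placed in the input box.
        if not self.recording:
            raise RuntimeError("stop() called before start()")
        self.recording = False
        audio = b"".join(self._chunks)
        return transcribe(audio).strip()
```

The returned string is ordinary text, which matches the last step: it can be edited freely before sending.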

Speaking clearly at a moderate pace will give you better transcription results.

Provider Features

Eleven Labs Scribe

Supports multilingual transcription in 32 languages
High accuracy, especially suitable for technical terminology
Free tier provides 10,000 characters per month

OpenAI Whisper

Industry-leading speech recognition accuracy
Supports multiple audio formats
Detects the spoken language automatically

Google Gemini

Multimodal understanding capabilities
Supports long audio transcription
Google AI provides generous free quotas
