Voice Input
Using speech-to-text for input in TalkCody
What is Voice Input?
Voice Input is TalkCody's speech-to-text feature: you speak, and TalkCody transcribes your speech into the input box so you don't have to type. It is convenient for entering long text quickly, describing programming problems, or using the app on the go.
TalkCody supports several transcription providers, so you can choose the service that best fits your needs.
Supported Transcription Providers
| Provider | Model | Features |
|---|---|---|
| Eleven Labs | Scribe | High-quality real-time multilingual transcription |
| OpenAI | Whisper | Industry-leading speech recognition, supports multiple languages |
| Google AI | Gemini | Multimodal understanding, supports long audio |
To use voice input, you must configure an API Key for at least one transcription provider and select a transcription model in Settings.
Configuration
Get API Key
Obtain the API Key for the transcription service you want to use:
Eleven Labs (Recommended)
- Visit elevenlabs.io
- Click "Sign Up" to register an account
- After logging in, click your avatar in the top right → "Profile + API key"
- Click the eye icon in the API Key section to view the key
- Copy the API key
OpenAI
- Visit platform.openai.com
- Log in or register an account
- Click "Create new secret key" to create a key
- Copy the key (format: `sk-...`)
Google AI
- Visit aistudio.google.com
- Log in with your Google account
- Click "Create API key in new project"
- Copy the generated key
Configure API Key
- Open TalkCody
- Click the Settings icon
- Navigate to the API Keys page
- Paste your API Key in the corresponding provider's input field
- Click Test Key to verify it works
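As a rough illustration of what a key check can involve, the TypeScript sketch below first checks that an OpenAI key has the `sk-` shape noted earlier, then verifies it against OpenAI's list-models endpoint. The function names are hypothetical, the no-whitespace rule is an assumption, and this is not necessarily how TalkCody's Test Key button works.

```typescript
// Hypothetical helpers for illustration; not TalkCody's actual implementation.

// Cheap local check (assumption): a key starts with "sk-", has content after
// the prefix, and contains no whitespace once trimmed.
function looksLikeOpenAIKey(key: string): boolean {
  const trimmed = key.trim();
  return trimmed.startsWith("sk-") && trimmed.length > 3 && !/\s/.test(trimmed);
}

// Network check: listing models returns a non-2xx status (401) for an invalid key.
async function verifyOpenAIKey(key: string): Promise<boolean> {
  if (!looksLikeOpenAIKey(key)) return false;
  const response = await fetch("https://api.openai.com/v1/models", {
    headers: { Authorization: `Bearer ${key.trim()}` },
  });
  return response.ok;
}
```

The local check only catches obvious paste errors; only the network call can confirm that a key is actually valid.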
Select Transcription Model
- In the Settings page, navigate to Model Settings
- Find the Transcription Model option
- Select the transcription model you want to use from the dropdown:
  - `eleven_scribe_v1` - Eleven Labs Scribe
  - `whisper-1` - OpenAI Whisper
  - `gemini-2.0-flash` - Google Gemini
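The relationship between the selected model and the API Key it requires can be sketched as a small lookup. This is a minimal TypeScript sketch, assuming the three model IDs from the dropdown; the provider names, the `ApiKeys` shape, and `checkVoiceInputConfig` are hypothetical and do not describe TalkCody's internal settings format.

```typescript
// Hypothetical provider identifiers, used only for this illustration.
type Provider = "elevenlabs" | "openai" | "google";

// Model IDs as listed in the dropdown, mapped to the provider whose key they need.
const MODEL_PROVIDER: Record<string, Provider> = {
  "eleven_scribe_v1": "elevenlabs",
  "whisper-1": "openai",
  "gemini-2.0-flash": "google",
};

type ApiKeys = Partial<Record<Provider, string>>;

// Returns a human-readable problem, or null when voice input is ready to use.
function checkVoiceInputConfig(model: string | undefined, keys: ApiKeys): string | null {
  if (!model) return "No transcription model selected";
  const provider = MODEL_PROVIDER[model];
  if (!provider) return `Unknown transcription model: ${model}`;
  const key = keys[provider];
  if (!key || key.trim() === "") return `Missing API key for ${provider}`;
  return null;
}
```

For example, selecting `whisper-1` with only a Google AI key configured would report a missing OpenAI key, which mirrors the requirement that the key and the model belong to the same provider.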
How to Use
After configuration, you can use the voice input feature in the chat input box:
- Click the microphone icon next to the input box
- Allow browser access to the microphone (first time only)
- Start speaking; your voice is recorded in real time
- Click the Stop button when finished
- Your speech is transcribed automatically and inserted into the input box
- Edit the transcribed text if needed, then send it
Speaking clearly at a moderate pace will give you better transcription results.
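For readers curious about what the transcription step involves, the sketch below shows how a recorded audio clip could be sent to OpenAI's `/v1/audio/transcriptions` endpoint with the `whisper-1` model. It is an illustration in TypeScript under stated assumptions: the function names and the `recording.webm` filename are hypothetical, and this is not TalkCody's actual code.

```typescript
// Hypothetical helpers for illustration; not TalkCody's actual implementation.

// Builds the multipart body: the endpoint expects a "file" and a "model" field.
function buildWhisperForm(audio: Blob, filename: string = "recording.webm"): FormData {
  const form = new FormData();
  form.append("file", audio, filename);
  form.append("model", "whisper-1");
  return form;
}

// Sends the audio and returns the transcribed text from the JSON response.
async function transcribeWithWhisper(audio: Blob, apiKey: string): Promise<string> {
  const response = await fetch("https://api.openai.com/v1/audio/transcriptions", {
    method: "POST",
    // Content-Type is left unset so fetch can add the multipart boundary itself.
    headers: { Authorization: `Bearer ${apiKey}` },
    body: buildWhisperForm(audio),
  });
  if (!response.ok) {
    throw new Error(`Transcription failed: HTTP ${response.status}`);
  }
  const data = (await response.json()) as { text: string };
  return data.text;
}
```

Keeping the form construction in its own function makes the request shape easy to inspect without making a network call.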
Provider Features
Eleven Labs Scribe
- Supports multilingual transcription in 32 languages
- High accuracy, especially for technical terminology
- Free tier provides 10,000 characters per month
OpenAI Whisper
- Industry-leading speech recognition accuracy
- Supports multiple audio formats
- Automatically detects the spoken language
Google Gemini
- Multimodal understanding capabilities
- Supports long audio transcription
- Google AI provides generous free quotas