Recording Modes
Automatic (VAD)
Recording starts automatically when you speak and stops during pauses. Perfect for continuous interviews.
How it works:
- The application analyzes audio with a neural network that distinguishes speech from background noise
- When speech is detected, recording begins
- When silence follows, recording ends and the audio is sent for transcription
- If speech continues for a long time, the recording is automatically split into parts (chunks)
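The steps above can be sketched as a small state machine. This is only an illustration, not the application's actual implementation: it assumes the neural network has already produced one speech/non-speech flag per audio frame, and the names `segment_speech`, `max_chunk_frames`, and `silence_frames_to_stop` are hypothetical.

```python
from typing import Iterable, List, Optional, Tuple


def segment_speech(
    frames: Iterable[bool],
    max_chunk_frames: int,
    silence_frames_to_stop: int = 1,
) -> List[Tuple[int, int]]:
    """Return (start, end) frame ranges, end exclusive, one per recording.

    `frames` holds one flag per audio frame (True means speech), as a
    hypothetical VAD model might produce. `silence_frames_to_stop` is
    assumed to be at least 1.
    """
    segments: List[Tuple[int, int]] = []
    start: Optional[int] = None  # first frame of the recording in progress
    silence_run = 0              # consecutive silent frames while recording
    total = 0                    # number of frames seen so far

    for i, is_speech in enumerate(frames):
        total = i + 1
        if start is None:
            if not is_speech:
                continue         # idle and still silent: nothing to do
            start, silence_run = i, 0   # speech detected: recording begins
        elif is_speech:
            silence_run = 0
        else:
            silence_run += 1

        if silence_run >= silence_frames_to_stop:
            # Pause detected: close the recording without the trailing silence.
            segments.append((start, i + 1 - silence_run))
            start = None
        elif i + 1 - start >= max_chunk_frames:
            # Speech is still going on: cut a chunk; the next speech frame
            # starts a new one.
            segments.append((start, i + 1))
            start = None

    if start is not None:
        # The input ended while a recording was still open.
        segments.append((start, total - silence_run))
    return segments


flags = [False, True, True, False, True, True, True, True, True, False]
print(segment_speech(flags, max_chunk_frames=3))
```

In this example the second stretch of speech is five frames long, so it is cut into a three-frame chunk followed by a two-frame chunk.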
Manual (Toggle)
You control the start and end of recording with the hotkey Ctrl+R. Suitable when it's important to record only specific moments.
The application continuously records audio into a background buffer. When you press Ctrl+R, the buffered seconds are added to the beginning of the recording, so the part of the conversation just before the press is still saved. Each recording is then checked by the neural network; if no speech is detected, the recording is automatically discarded.
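The background buffer can be pictured as a fixed-size queue that always holds the most recent audio. The sketch below is an assumption about how such a buffer might behave, not the application's code: the `ToggleRecorder` class is hypothetical, each fed item stands for one second of audio, and the speech validation step is left out.

```python
from collections import deque
from typing import Deque, List, Optional


class ToggleRecorder:
    """Hypothetical manual-mode recorder with a pre-roll buffer."""

    def __init__(self, buffer_seconds: int) -> None:
        # A deque with maxlen silently drops the oldest item when full.
        self._pre_roll: Deque[bytes] = deque(maxlen=buffer_seconds)
        self._recording: Optional[List[bytes]] = None

    def feed(self, second_of_audio: bytes) -> None:
        """Called for every captured second, whether recording or not."""
        if self._recording is not None:
            self._recording.append(second_of_audio)
        else:
            self._pre_roll.append(second_of_audio)

    def toggle(self) -> Optional[List[bytes]]:
        """Simulate the hotkey: start on the first press, stop on the second."""
        if self._recording is None:
            # Start: the buffered seconds become the beginning of the recording.
            self._recording = list(self._pre_roll)
            self._pre_roll.clear()
            return None
        finished, self._recording = self._recording, None
        return finished


recorder = ToggleRecorder(buffer_seconds=2)
for second in (b"a", b"b", b"c"):
    recorder.feed(second)        # only the last two seconds are kept
recorder.toggle()                # first press: recording starts with b, c
recorder.feed(b"d")
print(recorder.toggle())         # second press: recording ends
```

The finished recording contains the two buffered seconds followed by the second captured after the press.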
One-shot (Oneshot)
Works similarly to manual mode, but instead of recording from a start press to a stop press, it captures a fixed fragment from the buffer. Press the hotkey, and the application saves the last N seconds of audio.
Suitable when you need to quickly capture what was just said without thinking about turning recording on and off.
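Taking "the last N seconds" amounts to slicing the tail of the buffered samples. A minimal sketch, assuming the buffer is a flat sequence of samples at a known sample rate (the function name `last_seconds` and both parameters are illustrative, not part of the application):

```python
from typing import List, Sequence


def last_seconds(samples: Sequence[int], seconds: int, sample_rate: int) -> List[int]:
    """Return the samples covering the final `seconds` of the buffer."""
    wanted = seconds * sample_rate
    if wanted <= 0:
        return []                     # guard: samples[-0:] would return everything
    return list(samples[-wanted:])    # shorter buffers are returned whole


buffer = list(range(10))              # 10 samples at 3 samples per second
print(last_seconds(buffer, seconds=2, sample_rate=3))
```

If the buffer holds less audio than requested, the sketch simply returns everything available.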
Selecting Recording Mode
Open the settings (gear icon in the side menu) and go to the "Audio Recording" section. Here you can select the recording mode and configure other parameters.
General Settings
Audio Source
| Mode | Description |
|---|---|
| System audio + microphone | Records system audio (the interlocutor) and the microphone (you). Well suited to transcribing dialogues |
| System audio only | Records only system audio. Useful if you only need the interlocutor's speech |
| Microphone only | Records only the microphone. Use this if system audio is not needed or causes problems |
Microphone
In the settings, you can select a specific microphone. If none is selected, the system default is used.
Audio Output Device (Windows)
On Windows, you can select a device for capturing system audio — for example, headphones or speakers. The application will record audio that is played through the selected device.
Automatic Mode Settings (VAD)
Split into Chunks
Automatically splits long recordings into separate files.
Why this is needed: if the interlocutor speaks for a minute without pausing but asked the question at the very beginning, the application sends the first part for transcription and starts generating a response while the interlocutor is still speaking.
Chunk Length
Maximum duration of one audio file. Once this limit is reached, the recording is saved and a new chunk begins.
Range: from 5 to 10 seconds. Default: 7 seconds.
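As a worked example of the effect of chunk length, the sketch below splits one minute of uninterrupted speech using the default of 7 seconds. The helper `chunk_durations` is hypothetical and only shows the arithmetic; it assumes chunks are cut at exact multiples of the chunk length.

```python
from typing import List


def chunk_durations(speech_seconds: int, chunk_seconds: int = 7) -> List[int]:
    """Durations of the chunks produced by uninterrupted speech."""
    if chunk_seconds <= 0:
        raise ValueError("chunk_seconds must be positive")
    full, rest = divmod(speech_seconds, chunk_seconds)
    # Full-length chunks first, then whatever is left when the speech ends.
    return [chunk_seconds] * full + ([rest] if rest else [])


print(chunk_durations(60))  # eight 7-second chunks and one 4-second chunk
```

Under these assumptions the first chunk is ready for transcription after 7 seconds instead of after the full minute.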
Manual Mode Settings (Toggle)
Buffer Length
The application continuously records audio into a background buffer. When you press Ctrl+R, the buffered seconds are added to the beginning of the recording. This is useful if you did not press the hotkey in time.
Range: from 0 to 15 seconds. Default: 4 seconds.
One-shot Mode Settings (Oneshot)
Snapshot Duration
How many seconds of audio to capture from the buffer when the hotkey is pressed.
Range: from 5 to 30 seconds. Default: 20 seconds.
Clear Buffer After Snapshot
If enabled, the buffer is cleared after each snapshot. This prevents sending the same audio fragment again on multiple consecutive presses.
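The difference this option makes shows up when the hotkey is pressed twice in a row. The following sketch is illustrative only: `SnapshotBuffer` and its parameters are hypothetical names, and each stored item stands for one second of audio.

```python
from collections import deque
from typing import Deque, List


class SnapshotBuffer:
    """Hypothetical one-shot buffer with an optional clear-after-snapshot step."""

    def __init__(self, capacity_seconds: int, clear_after_snapshot: bool) -> None:
        self._seconds: Deque[str] = deque(maxlen=capacity_seconds)
        self._clear_after_snapshot = clear_after_snapshot

    def feed(self, second_of_audio: str) -> None:
        self._seconds.append(second_of_audio)

    def snapshot(self, duration_seconds: int) -> List[str]:
        """Return the last `duration_seconds` items, as a hotkey press would."""
        items = list(self._seconds)
        taken = items[-duration_seconds:] if duration_seconds > 0 else []
        if self._clear_after_snapshot:
            self._seconds.clear()   # the next press cannot resend this audio
        return taken


for clear in (True, False):
    buf = SnapshotBuffer(capacity_seconds=30, clear_after_snapshot=clear)
    for second in ("s1", "s2", "s3"):
        buf.feed(second)
    first, second_press = buf.snapshot(2), buf.snapshot(2)
    print(clear, first, second_press)
```

With clearing enabled the second press returns nothing new; with it disabled the same two seconds are returned again.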
Frequently Asked Questions
Recording triggers from noise
The neural network filters out most non-speech sounds, but a high level of background noise can still cause false triggers. Try reducing the microphone sensitivity in the system settings, or use a headset.
Cuts off the beginning of phrases
In manual mode, increase the buffer length.
Recordings are discarded as empty
The neural network checks each recording for speech. If the model does not detect a voice (for example, only background noise or music was recorded), the recording is automatically discarded. Make sure your microphone is properly configured and your speech is clear enough.