Speaker Diarizer

Answers "who spoke when?" in a meeting or interview recording. Separates speakers automatically and produces a timestamped text report.

What it does

Use this to break a meeting into per-speaker quotes, distinguish the interviewer from the interviewee, or pull individual contributions out of a panel recording.

How to use

  1. Drag meeting/interview recordings into the list.
  2. Pick the Expected Speaker Count (Auto-Detect or a number from 2 to 10).
  3. Pick a Neural Mode: Standard, High Precision, or Turbo.
  4. Click Run.

You get one timestamped .txt report per file. Format:

[00:00:05] SPEAKER_01: Hi, welcome.
[00:00:12] SPEAKER_02: Thanks.
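
If you want to process a report in a script, the format above can be read line by line. The following is a minimal sketch, not part of the tool: it assumes every line follows the `[HH:MM:SS] SPEAKER_NN: text` pattern shown in the sample, and the names `Turn`, `parse_line`, and `parse_report` are hypothetical helpers invented for this illustration.

```python
import re
from typing import List, NamedTuple, Optional

# Assumed line shape, taken from the sample report:
# "[00:00:05] SPEAKER_01: Hi, welcome."
LINE_RE = re.compile(r"^\[(\d{2}):(\d{2}):(\d{2})\]\s+(SPEAKER_\d+):\s*(.*)$")


class Turn(NamedTuple):
    seconds: int   # start time as seconds from the beginning of the recording
    speaker: str   # label such as "SPEAKER_01"
    text: str      # what was said


def parse_line(line: str) -> Optional[Turn]:
    """Parse one report line; return None if it does not match the assumed format."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    hours, minutes, secs, speaker, text = match.groups()
    start = int(hours) * 3600 + int(minutes) * 60 + int(secs)
    return Turn(start, speaker, text)


def parse_report(report_text: str) -> List[Turn]:
    """Parse a whole report, silently skipping lines that do not match."""
    turns = []
    for line in report_text.splitlines():
        turn = parse_line(line)
        if turn is not None:
            turns.append(turn)
    return turns
```

Lines that do not match (blank lines, or any header the real report might contain) are skipped rather than raising an error, since the exact report layout beyond the sample is not documented here.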

Speaker count

  • Auto-Detect: The system estimates how many speakers there are. Try this first.
  • A number from 2 to 10: If you know exactly how many people spoke, set that number. It gives better accuracy than Auto-Detect.

Quality modes

Mode            Best for
Standard        Balanced, the everyday choice.
High Precision  Important recordings, production use. Slower but more accurate.
Turbo           Fast draft, lower accuracy.

Examples

Break a meeting into speakers: Add the meeting, Auto-Detect, Standard, run. Each speaker is labelled SPEAKER_01, SPEAKER_02 and so on.

Q&A split for an interview: Add the interview, Speakers 2, High Precision, run. Both speakers come out cleanly separated.

Analyse a panel recording: Add the panel, Speakers 4, Standard, run.

Batch interview archive: Add many interviews, Speakers 2, run. Each one gets its own report.
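
Once a report exists, pulling out each person's contributions is a small scripting task. The sketch below is an illustration only: it assumes the `[HH:MM:SS] SPEAKER_NN: text` line format shown in the sample report, and `quotes_by_speaker` is a hypothetical helper name, not something the tool provides.

```python
import re
from collections import defaultdict
from typing import Dict, List, Tuple

# Assumed line shape: "[00:00:05] SPEAKER_01: Hi, welcome."
LINE_RE = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\]\s+(SPEAKER_\d+):\s*(.*)$")


def quotes_by_speaker(report_text: str) -> Dict[str, List[Tuple[str, str]]]:
    """Group (timestamp, text) pairs under each speaker label, in report order."""
    grouped = defaultdict(list)
    for line in report_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip anything that is not a speaker line
        timestamp, speaker, text = match.groups()
        grouped[speaker].append((timestamp, text))
    return dict(grouped)
```

For a batch of interviews, the same function can be called once per report file, since each recording gets its own report.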

Watch out

  • The first run downloads the Whisper model from the internet. Later runs are offline.
  • Speakers are labelled SPEAKER_01, SPEAKER_02, and so on, not with real names. You can rename them by hand afterwards.
  • Very similar voices (for example siblings, or speakers of the same gender) may be confused with each other.
  • Very crowded settings (10+ people) hurt accuracy.
  • Recordings with heavy music or background ambience produce inconsistent results.
  • For a single speaker, just run transcription. Diarization is not needed.
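
Renaming the generic labels does not have to be done line by line. As a rough sketch (the tool itself has no such feature described here), the snippet below swaps labels for real names in a report's text. It assumes the `[HH:MM:SS] SPEAKER_NN: text` format from the sample report; `rename_speakers` and the name mapping are hypothetical.

```python
import re
from typing import Dict

# Match the label only where it appears right after the timestamp,
# so a "SPEAKER_01" mentioned inside the spoken text is left alone.
LABEL_RE = re.compile(r"^(\[\d{2}:\d{2}:\d{2}\]\s+)(SPEAKER_\d+)(?=:)", re.MULTILINE)


def rename_speakers(report_text: str, names: Dict[str, str]) -> str:
    """Replace speaker labels using `names`; labels without an entry stay unchanged."""
    def swap(match: "re.Match[str]") -> str:
        prefix, label = match.group(1), match.group(2)
        return prefix + names.get(label, label)

    return LABEL_RE.sub(swap, report_text)
```

Usage: read the .txt report into a string, call `rename_speakers(text, {"SPEAKER_01": "Alice"})`, and write the result to a new file so the original report is kept.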

License

This tool is fully available on the Ultimate plan. The Free tier has a monthly cap.