AI transcription software that actually delivers

Excellent accuracy on clear audio, speaker recognition, 100+ languages, and export in every format you need. No credit card required. Built for creators, teams, and enterprises.

Transcription & speech-to-text FAQs

Accuracy, speaker ID, export formats, and plan limits

Speaker identification (Pro and above) uses voice activity detection and speaker embedding models to identify unique voice patterns. The system analyzes pitch, tone, and speech patterns to distinguish speakers and assigns labels such as "Speaker 1" and "Speaker 2". It works best with clear audio and distinct voices, and can handle from two to ten or more speakers. Each speaker keeps the same label throughout the transcript.
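To illustrate how embedding-based labeling can work in principle, here is a minimal sketch. It is not the product's actual pipeline: it assumes each speech segment has already been turned into a numeric embedding vector, and it uses a simple greedy cosine-similarity grouping chosen purely for illustration. The function names and the 0.8 threshold are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def label_speakers(embeddings, threshold=0.8):
    """Assign "Speaker N" labels by greedy similarity matching.

    Assumption: one embedding per speech segment, in transcript order.
    The first embedding seen for a speaker acts as that speaker's reference.
    """
    references = []  # one reference embedding per discovered speaker
    labels = []
    for emb in embeddings:
        best_index, best_score = None, threshold
        for i, ref in enumerate(references):
            score = cosine(emb, ref)
            if score >= best_score:
                best_index, best_score = i, score
        if best_index is None:
            # No known speaker is similar enough: start a new one.
            references.append(emb)
            best_index = len(references) - 1
        labels.append(f"Speaker {best_index + 1}")
    return labels
```

Because a segment is always matched against the same stored references, a voice that reappears later in the recording receives the label it was given the first time.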

Yes! On Pro plans and above, you can customize summary length, focus areas, and output format. You can request brief summaries, detailed summaries, or summaries focused on specific topics like action items, decisions, or key quotes. The AI can also extract specific information like dates, names, or topics based on your needs.

Major languages (e.g. English, Spanish, French, German, Italian, Portuguese, Mandarin) usually see the highest accuracy on clear audio. Many European and major Asian languages perform well; lower-resource or regional languages can show more variance. Exact word accuracy depends on the model, language, accent, and recording quality, so use clear audio for best results.

Batch processing allows you to upload multiple files at once (up to 50 files on Studio plans, unlimited on Agency/Enterprise). Files are processed in parallel for maximum efficiency. You can track progress for each file individually, and all transcripts are available in your dashboard once complete. Batch exports are also supported for downloading multiple transcripts at once.
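If you prepare uploads with a script, the per-batch limit can be respected by splitting a file list up front. The sketch below is only an illustration: the plan keys and the `plan_batches` helper are hypothetical names, and the limits simply mirror the figures quoted above.

```python
from typing import Iterator, List, Optional, Sequence

# Files per batch, taken from the plan limits described above.
# None means no per-batch limit. The keys are illustrative names.
BATCH_LIMITS = {"studio": 50, "agency": None, "enterprise": None}

def plan_batches(files: Sequence[str], plan: str) -> Iterator[List[str]]:
    """Yield lists of files, each small enough for one batch upload."""
    if not files:
        return
    limit: Optional[int] = BATCH_LIMITS[plan]
    if limit is None:
        yield list(files)
        return
    for start in range(0, len(files), limit):
        yield list(files[start:start + limit])
```

For example, 120 files on a Studio plan would be split into batches of 50, 50, and 20, while the same list on an Agency plan stays in a single batch.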

Real-time transcription via API is coming soon for Enterprise customers. It will use WebSocket connections for streaming audio and will provide sub-second latency. The current API supports file uploads with async processing and webhook notifications when transcription completes.

We support common audio formats (AAC, FLAC, M4A, MP3, MPEG, MPGA, OGG, WAV) and video formats (MP4, WEBM). File size limits may apply by plan. For very large files or custom limits, contact us for enterprise options.
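A quick local check before uploading can catch unsupported files early. This is a minimal sketch that only inspects the file extension against the formats listed above; the `media_kind` helper is a hypothetical name, and an extension match does not guarantee the file itself is valid.

```python
from pathlib import Path
from typing import Optional

# Extensions taken from the supported-format list above.
AUDIO_EXTENSIONS = {"aac", "flac", "m4a", "mp3", "mpeg", "mpga", "ogg", "wav"}
VIDEO_EXTENSIONS = {"mp4", "webm"}

def media_kind(filename: str) -> Optional[str]:
    """Return "audio", "video", or None based on the file extension."""
    ext = Path(filename).suffix.lower().lstrip(".")
    if ext in AUDIO_EXTENSIONS:
        return "audio"
    if ext in VIDEO_EXTENSIONS:
        return "video"
    return None
```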

Translation accuracy depends on the source and target languages. For major language pairs (English ↔ Spanish, French, German, etc.), accuracy is 90%+. For less common pairs, accuracy may be 80-85%. Translations maintain context and proper grammar, making them suitable for understanding content, though professional translation may be needed for publication.

Yes! All transcripts can be edited directly in the dashboard. You can correct errors, add speaker names, format text, and make any changes needed. All edits are saved automatically and can be re-exported in any format. Edits don't affect the original audio file.

Custom vocabulary (Agency and Enterprise) lets you train the system on industry-specific terms, technical jargon, brand names, and specialized terminology. You provide terms and pronunciations, and the system improves accuracy for those terms. This is especially useful for legal, medical, technical, or brand-specific content.

API access is available on Agency and Enterprise plans. When you submit a transcription job via API, you can specify a webhook URL. When the job completes, we send a POST request to your webhook with transcript data, status, and metadata. Webhooks include retry logic and signature verification for security.
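Signature verification on the receiving side usually means recomputing a keyed hash of the raw request body and comparing it with the value sent alongside the request. The sketch below assumes an HMAC-SHA256 signature encoded as a hex string and a shared secret; the actual algorithm, encoding, and header name are not stated here, so treat all of these as assumptions and check the API documentation.

```python
import hashlib
import hmac

def sign_payload(secret: bytes, body: bytes) -> str:
    """Compute a hex HMAC-SHA256 signature (assumed scheme) for a raw body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()

def verify_signature(secret: bytes, body: bytes, received_signature: str) -> bool:
    """Return True if the received signature matches the raw request body.

    compare_digest runs in constant time, which avoids leaking how many
    leading characters of the signature were correct.
    """
    expected = sign_payload(secret, body)
    return hmac.compare_digest(expected, received_signature)
```

Verify against the raw bytes of the request body rather than a re-serialized copy of the parsed JSON, since re-serialization can change whitespace or key order and break the match.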

Yes! Word-level timestamps are available on Pro plans and above. They're included in JSON exports and can be used for precise video editing, subtitle creation, and advanced applications. Word-level timestamps show the exact start and end time for each word in the transcript.
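As an example of subtitle creation from word-level timestamps, the sketch below groups words into SRT cues. The JSON export schema is not shown here, so the input shape (a list of objects with `word`, `start`, and `end` keys, times in seconds) is an assumption, as are the helper names.

```python
def srt_time(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"

def words_to_srt(words, max_words=7):
    """Build SRT text from word entries (assumed keys: word, start, end)."""
    cues = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        text = " ".join(w["word"] for w in chunk)
        # Each cue spans from its first word's start to its last word's end.
        start = srt_time(chunk[0]["start"])
        end = srt_time(chunk[-1]["end"])
        cues.append(f"{len(cues) + 1}\n{start} --> {end}\n{text}\n")
    return "\n".join(cues)
```

Grouping by a fixed word count keeps the example short; a real subtitle tool would more likely break cues on pauses or punctuation.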

The AI Q&A feature (Pro and above) lets you ask questions about your transcript content. The AI analyzes the full transcript and answers based on what it contains. You can ask about specific topics, people mentioned, decisions made, action items, or any other information in the transcript. Answers are drawn from the transcript itself, so they stay in context.

Enterprise plans include SOC 2 Type II, GDPR, and HIPAA compliance. We maintain annual certifications and undergo regular security audits. For specific compliance requirements, contact our enterprise team. All plans include basic security features like encryption and data ownership.

Agency and Enterprise plans include API access for custom integrations. Our API documentation provides guides and code examples. Pre-built integrations for popular tools are coming soon. Enterprise customers can request dedicated support for custom integrations.

You can download and back up transcripts at any time. Storage and retention may vary by plan; check your plan details for the current policy. Deleted transcripts are permanently removed according to our data retention policy.

If a transcription fails, it is automatically retried once. If the retry also fails, you'll be notified via email and in-app notification, and the file won't count against your quota. You can retry or recover failed or incomplete transcripts from the dashboard or via API, so no work is lost. Common failure reasons include corrupted files, unsupported formats, and processing errors.

Free-tier exports may include a watermark. Pro, Studio, Agency, and Enterprise plans have no watermark on exports.

Yes. Use the dashboard folders page to create folders and organize your transcriptions by project, client, or show. Available on all plans.

Yes! You can upload recordings of phone calls in any supported format. For best results, use clear recordings with minimal background noise. Phone call transcriptions work well for customer support, interviews, and business calls. Real-time phone call transcription is coming soon for Enterprise customers.