Overview
IntraScribe is a privacy-first speech transcription and collaboration platform that can run fully on-premise, designed for organizations such as enterprises, schools, and government agencies. It supports real-time transcription over WebRTC, speaker diarization, high-quality batch transcription, and template-driven AI summarization and title generation. The system is built from modular frontends and microservice-based backends, and keeps data within private networks to help meet compliance requirements.
Key Features
- Privacy and on-premise deployment: can be deployed in air-gapped or internal networks so that audio and transcripts stay inside the organization's network.
- Low-latency real-time ASR: audio is captured over WebRTC, and transcription segments are streamed back to the client via SSE.
- Speaker diarization and editable transcripts: diarization via pyannote.audio with frontend editing and renaming support.
- High-quality batch re-transcription: post-session processing to improve accuracy and produce structured transcripts.
- Template-driven AI summarization: calls an LLM through LiteLLM to generate structured Markdown summaries and concise titles.
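
To illustrate how diarization output and transcription segments could be combined into an editable, speaker-labeled transcript, here is a minimal sketch. It is not IntraScribe's actual implementation: the `Turn` and `Segment` types, the `assign_speakers` function, and the largest-overlap rule are all assumptions made for illustration. The `SPEAKER_00`-style labels mirror what pyannote.audio typically emits, and the `names` mapping stands in for the renaming a user would do in the frontend.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Turn:
    """A diarization turn: who spoke between start and end (seconds)."""
    speaker: str
    start: float
    end: float


@dataclass
class Segment:
    """A transcription segment; speaker is filled in by assign_speakers."""
    text: str
    start: float
    end: float
    speaker: Optional[str] = None


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(
    segments: List[Segment],
    turns: List[Turn],
    names: Optional[Dict[str, str]] = None,
) -> List[Segment]:
    """Label each segment with the speaker whose turn overlaps it most.

    `names` maps raw diarization labels to user-chosen display names
    (hypothetical stand-in for frontend renaming).
    """
    names = names or {}
    labeled = []
    for seg in segments:
        best = max(
            turns,
            key=lambda t: overlap(seg.start, seg.end, t.start, t.end),
            default=None,
        )
        if best is None or overlap(seg.start, seg.end, best.start, best.end) == 0.0:
            speaker = None  # no diarization turn covers this segment
        else:
            speaker = names.get(best.speaker, best.speaker)
        labeled.append(Segment(seg.text, seg.start, seg.end, speaker))
    return labeled


segments = [Segment("Hello", 0.0, 2.0), Segment("Hi there", 2.1, 4.0)]
turns = [Turn("SPEAKER_00", 0.0, 2.05), Turn("SPEAKER_01", 2.05, 4.2)]
result = assign_speakers(segments, turns, names={"SPEAKER_00": "Alice"})
```

Keeping the raw label alongside a separate display-name mapping means a rename applies to every segment from that speaker without re-running diarization.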
Use Cases
- Meeting minutes and knowledge capture for organizations requiring internal-only deployments and auditability.
- Classroom and seminar transcription with multi-speaker annotation and editing.
- Sensitive environments such as command centers or production sites where latency and privacy are critical.
Technical Highlights
- Frontend: Next.js (App Router) + React + TypeScript, with WebRTC integration.
- Backend: microservices built with FastAPI (Python), including speech-to-text (STT), diarization, and agent services; supports GPU-accelerated models.
- Storage and realtime: Supabase (Postgres + Auth + Storage + Realtime) for data storage and event subscriptions.
- Extensible models: replaceable STT models (e.g., FunASR) and interchangeable LLMs for summarization, accessed through LiteLLM.
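
One way replaceable STT models could be wired up is a small backend registry, sketched below. Everything here is hypothetical and for illustration only: the `STTBackend` protocol, `register_backend`, `create_backend`, and the `DummyBackend` class are not taken from IntraScribe, and a real backend would wrap a model such as FunASR instead of returning a placeholder string.

```python
from typing import Callable, Dict, Protocol


class STTBackend(Protocol):
    """Minimal interface every speech-to-text backend would satisfy (assumed)."""

    def transcribe(self, audio: bytes, sample_rate: int) -> str: ...


# Maps a backend name (e.g. from configuration) to a zero-argument factory.
_BACKENDS: Dict[str, Callable[[], STTBackend]] = {}


def register_backend(name: str):
    """Class decorator that registers a backend factory under `name`."""
    def decorator(factory):
        _BACKENDS[name] = factory
        return factory
    return decorator


def create_backend(name: str) -> STTBackend:
    """Instantiate the backend registered under `name`."""
    try:
        factory = _BACKENDS[name]
    except KeyError:
        raise ValueError(
            f"unknown STT backend {name!r}; available: {sorted(_BACKENDS)}"
        ) from None
    return factory()


@register_backend("dummy")
class DummyBackend:
    """Placeholder backend: reports audio duration instead of transcribing."""

    def transcribe(self, audio: bytes, sample_rate: int) -> str:
        # Assumes 16-bit mono PCM, i.e. 2 bytes per sample.
        seconds = len(audio) / (2 * sample_rate)
        return f"[{seconds:.1f}s of audio]"


backend = create_backend("dummy")
text = backend.transcribe(b"\x00" * 32000, sample_rate=16000)
```

With this shape, switching models becomes a configuration change (a different name passed to `create_backend`) rather than a change to the service code that calls `transcribe`.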