
IntraScribe

A privacy-first, on-premise-ready speech transcription and collaboration platform with real-time ASR, speaker diarization, batch transcription, and AI summarization.

Overview

IntraScribe is a privacy-first, on-premise-capable speech transcription and collaboration platform designed for organizations such as enterprises, schools, and government units. It supports real-time transcription (WebRTC), speaker diarization, high-quality batch transcription, and template-driven AI summarization and title generation. The system emphasizes modular frontends, microservice-based backends, and keeping data within private networks to meet compliance requirements.

Key Features

  • Privacy and on-premise deployment: can be deployed in air-gapped or internal networks to prevent data exfiltration.
  • Low-latency real-time ASR: audio is captured over WebRTC, and transcription segments are streamed to the client as they are produced (Server-Sent Events, SSE).
  • Speaker diarization and editable transcripts: diarization via pyannote.audio with frontend editing and renaming support.
  • High-quality batch re-transcription: post-session processing to improve accuracy and produce structured transcripts.
  • Template-driven AI summarization: calls LLMs through LiteLLM to generate structured Markdown summaries and concise titles.
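
As a rough illustration of the real-time streaming and speaker-renaming features listed above, the following minimal sketch frames diarized transcript segments as Server-Sent Events. The `Segment` schema, the `segment` event name, and all function names are assumptions made for this example; they are not taken from the IntraScribe codebase.

```python
import json
from dataclasses import asdict, dataclass, replace
from typing import Dict, Iterable, Iterator, List


@dataclass
class Segment:
    """One transcribed span of audio (hypothetical schema)."""
    speaker: str  # diarization label, e.g. "SPEAKER_00"
    start: float  # seconds from session start
    end: float    # seconds from session start
    text: str


def to_sse_event(segment: Segment, event: str = "segment") -> str:
    """Frame one segment as a single Server-Sent Events message."""
    # json.dumps escapes newlines, so the payload always fits on one data line.
    payload = json.dumps(asdict(segment), ensure_ascii=False)
    # An SSE message is a set of "field: value" lines ended by a blank line.
    return f"event: {event}\ndata: {payload}\n\n"


def rename_speakers(segments: Iterable[Segment], names: Dict[str, str]) -> List[Segment]:
    """Return copies of the segments with diarization labels mapped to display names."""
    return [replace(s, speaker=names.get(s.speaker, s.speaker)) for s in segments]


def stream_segments(segments: Iterable[Segment]) -> Iterator[str]:
    """Yield SSE frames one at a time, as a streaming HTTP response would send them."""
    for segment in segments:
        yield to_sse_event(segment)
```

In a FastAPI service, a generator like `stream_segments` could plausibly be returned through `StreamingResponse` with `media_type="text/event-stream"`; whether IntraScribe does this is not stated in the source.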

Use Cases

  • Meeting minutes and knowledge capture for organizations requiring internal-only deployments and auditability.
  • Classroom and seminar transcription with multi-speaker annotation and editing.
  • Sensitive environments such as command centers or production sites where latency and privacy are critical.

Technical Highlights

  • Frontend: Next.js (App Router) + React + TypeScript, with WebRTC integration.
  • Backend: microservices built with FastAPI (Python), including STT, diarization and agent services; supports GPU-accelerated models.
  • Storage and realtime: Supabase (Postgres + Auth + Storage + Realtime) for data storage and event subscriptions.
  • Extensible models: replaceable STT models (e.g., FunASR) and lightweight LLMs, accessed through LiteLLM, for summarization.
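
One common way to make STT models replaceable, as the last point describes, is to hide each backend behind a small interface and select it by name from configuration. The sketch below shows that pattern under stated assumptions: the `Transcriber` protocol, the registry, and the `dummy` backend are hypothetical, and a real backend would wrap a model such as FunASR instead.

```python
from typing import Callable, Dict, Protocol


class Transcriber(Protocol):
    """Minimal interface every STT backend is assumed to implement."""

    def transcribe(self, audio: bytes, sample_rate: int) -> str: ...


# Maps a backend name (e.g. from a config file) to a factory that builds it.
_REGISTRY: Dict[str, Callable[[], Transcriber]] = {}


def register(name: str):
    """Class decorator that adds a backend to the registry under `name`."""
    def decorator(factory):
        _REGISTRY[name] = factory
        return factory
    return decorator


@register("dummy")
class DummyTranscriber:
    """Stand-in backend that only reports the audio duration."""

    def transcribe(self, audio: bytes, sample_rate: int) -> str:
        # Assumes 16-bit mono PCM: two bytes per sample.
        seconds = len(audio) / (2 * sample_rate)
        return f"[{seconds:.1f}s of audio]"


def get_transcriber(name: str) -> Transcriber:
    """Build the backend registered under `name`, or fail with a clear error."""
    factory = _REGISTRY.get(name)
    if factory is None:
        raise ValueError(f"unknown STT backend: {name!r}")
    return factory()
```

With this layout, swapping models is a configuration change: the calling service asks for `get_transcriber("dummy")` or another registered name and never imports a specific model directly.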

Resource Info
🌱 Open Source 🔊 Audio 📱 Application