PDFMathTranslate

A scientific PDF translation tool that preserves layout and mathematical formulas while supporting multiple translation backends.

Author: Byaidu

Added Date: 2025-09-27

Open Source Since: 2024-09-06


Overview

PDFMathTranslate translates scientific PDFs while attempting to preserve formulas, charts, tables of contents, and annotations. It supports multiple translation services and is available as a command-line interface, a GUI, Docker images, and integrations such as a Zotero plugin.

Key features

Layout preservation: retains mathematical formulas, tables and figures as much as possible to minimize post-editing.
Multiple backends: supports Google, DeepL, OpenAI, Ollama and custom backends with caching.
Flexible deployment: CLI, GUI, Docker images, and plugins for different workflows.

Use cases

Batch translating academic papers while keeping readable layouts for reviewers and collaborators.
Generating bilingual documents for comparison, teaching, or accessible reading.
Deploying in air-gapped or enterprise environments via Docker or local installs.

Technical details

Uses document parsing libraries (e.g., PyMuPDF, pdfminer.six) and layout recognition modules to handle complex formatting.
Supports concurrent translation, chunking, and caching to improve throughput and reliability.
Exposes a Python API and HTTP endpoints for integration into downstream systems such as literature managers and automated summarizers.
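
The chunking, caching, and concurrency mentioned above can be illustrated with a minimal sketch. This is not PDFMathTranslate's actual code or API: the names chunk_paragraphs, CachedTranslator, and demo_backend are hypothetical, and the backend is a stand-in function rather than a real translation service. It only shows one plausible way the three ideas fit together, using the Python standard library.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def chunk_paragraphs(paragraphs: List[str], max_chars: int) -> List[str]:
    """Group consecutive paragraphs into chunks of at most max_chars characters.

    A single paragraph longer than max_chars is kept whole as its own chunk.
    (Hypothetical helper, not part of PDFMathTranslate.)
    """
    chunks: List[str] = []
    current = ""
    for para in paragraphs:
        candidate = f"{current}\n{para}" if current else para
        if current and len(candidate) > max_chars:
            # Adding this paragraph would overflow, so close the current chunk.
            chunks.append(current)
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


class CachedTranslator:
    """Translate chunks concurrently and reuse results for repeated text.

    `backend` is any callable mapping source text to translated text; here it
    stands in for a real service client (an assumption for illustration).
    """

    def __init__(self, backend: Callable[[str], str], max_workers: int = 4):
        self.backend = backend
        self.max_workers = max_workers
        self.cache: Dict[str, str] = {}

    @staticmethod
    def _key(text: str) -> str:
        # Hash the text so cache keys stay small even for long chunks.
        return hashlib.sha256(text.encode("utf-8")).hexdigest()

    def translate_all(self, chunks: List[str]) -> List[str]:
        # Collect only chunks that are not cached yet, deduplicated by key.
        pending: Dict[str, str] = {}
        for chunk in chunks:
            key = self._key(chunk)
            if key not in self.cache:
                pending[key] = chunk

        # Send the uncached chunks to the backend in parallel threads.
        with ThreadPoolExecutor(max_workers=self.max_workers) as pool:
            results = pool.map(self.backend, pending.values())
            for key, translated in zip(pending.keys(), results):
                self.cache[key] = translated

        # Return translations in the original chunk order.
        return [self.cache[self._key(chunk)] for chunk in chunks]


def demo_backend(text: str) -> str:
    """Placeholder backend: uppercases text instead of translating it."""
    return text.upper()


if __name__ == "__main__":
    paragraphs = ["First paragraph.", "Second paragraph.", "First paragraph."]
    translator = CachedTranslator(demo_backend, max_workers=2)
    for line in translator.translate_all(chunk_paragraphs(paragraphs, 20)):
        print(line)
```

Threads suit this sketch because calls to a remote translation service are typically I/O-bound; keying the cache on a hash of the source text means a repeated chunk is sent to the backend only once.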
