
Kreuzberg

A polyglot document intelligence framework with a Rust core. Extract text, metadata, and structured information from 75+ formats. Available for Rust, Python, Ruby, Java, Go, PHP, Elixir, C#, and TypeScript (Node/Bun/Wasm/Deno), or use it via the CLI, REST API, or MCP server.

Cost / License

  • Free
  • Open Source (MIT)

Platforms

  • Mac
  • Windows
  • Linux
  • Python

Features

  1.  OCR
  2.  Python-based
  3.  Rust
  4.  Extract text from image

Kreuzberg information

  • Developed by

    Na'aman Hirschfeld
  • Licensing

    Open Source (MIT) and Free product.
  • Alternatives

    0 alternatives listed
  • Supported Languages

    • English
    • Chinese
    • French
    • Japanese
    • Korean
    • German

AlternativeTo Categories

Development, OS & Utilities, Office & Productivity

GitHub repository

  •  5,781 Stars
  •  240 Forks
  •  2 Open Issues
Kreuzberg was added to AlternativeTo by Paul.

What is Kreuzberg?

Extract text and metadata from a wide range of file formats (75+), generate embeddings and post-process at native speeds without needing a GPU.

Key Features

  • Extensible architecture – Plugin system for custom OCR backends, validators, post-processors, and document extractors
  • Polyglot – Native bindings for Rust, Python, TypeScript/Node.js, Ruby, Go, Java, C#, PHP, and Elixir
  • 75+ file formats – PDF, Office documents, images, HTML, XML, emails, archives, and academic formats across 8 categories
  • OCR support – Tesseract (all bindings), PaddleOCR (all native bindings), EasyOCR (Python), extensible via the plugin API
  • High performance – Rust core with native PDFium, SIMD optimizations, and full parallelism
  • Flexible deployment – Use as a library, CLI tool, REST API server, or MCP server
  • Memory efficient – Streaming parsers for multi-GB files
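To make the plugin idea concrete, here is a minimal sketch of how a post-processor registry could work in principle. It is an illustration only and does not use or reproduce Kreuzberg's actual API: `ExtractionResult`, `PluginRegistry`, `normalize_whitespace`, and `add_word_count` are all hypothetical names invented for this example. See the official documentation for the real plugin interfaces.

```python
from dataclasses import dataclass, field
from typing import Callable


# Hypothetical stand-in for an extraction result; NOT Kreuzberg's real type.
@dataclass
class ExtractionResult:
    content: str
    metadata: dict[str, str] = field(default_factory=dict)


# A post-processor takes a result and returns a (possibly new) result.
PostProcessor = Callable[[ExtractionResult], ExtractionResult]


class PluginRegistry:
    """Hypothetical registry that applies post-processors in registration order."""

    def __init__(self) -> None:
        self._post_processors: list[tuple[str, PostProcessor]] = []

    def register(self, name: str, processor: PostProcessor) -> None:
        self._post_processors.append((name, processor))

    def run(self, result: ExtractionResult) -> ExtractionResult:
        # Feed the output of each plugin into the next one.
        for _name, processor in self._post_processors:
            result = processor(result)
        return result


def normalize_whitespace(result: ExtractionResult) -> ExtractionResult:
    # Collapse runs of spaces and newlines into single spaces.
    return ExtractionResult(" ".join(result.content.split()), dict(result.metadata))


def add_word_count(result: ExtractionResult) -> ExtractionResult:
    # Attach a simple whitespace-delimited word count to the metadata.
    metadata = dict(result.metadata)
    metadata["word_count"] = str(len(result.content.split()))
    return ExtractionResult(result.content, metadata)


registry = PluginRegistry()
registry.register("normalize_whitespace", normalize_whitespace)
registry.register("word_count", add_word_count)

processed = registry.run(ExtractionResult("Hello   world\n\nfrom  a PDF"))
print(processed.content)
print(processed.metadata)
```

Keeping each plugin a plain function from result to result means plugins can be ordered, tested, and swapped independently, which is the general appeal of this kind of extensible architecture.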

Installation

Each language binding provides comprehensive documentation with examples and best practices.

Key Features

  • OCR with Table Extraction
  • Batch Processing
  • Password-Protected PDFs
  • Language Detection
  • Metadata Extraction

AI Coding Assistants

Kreuzberg ships with an Agent Skill that teaches AI coding assistants how to use the library correctly. It works with Claude Code, Codex, Gemini CLI, Cursor, VS Code, Amp, Goose, Roo Code, and any tool supporting the Agent Skills standard.

Documentation: https://docs.kreuzberg.dev/

Contributing

Contributions are welcome! https://github.com/kreuzberg-dev/kreuzberg

License

MIT License; see LICENSE for details. You can use Kreuzberg freely in both commercial and closed-source products with no obligations, no viral effects, and no licensing restrictions.
