Kreuzberg
Kreuzberg is a Python library for text extraction from documents. It provides a unified async interface for extracting text from PDFs, images, office documents, and more.
Cost / License
- Free
- Open Source
Platforms
- Python
- Linux
- Mac
- Windows
Features
- OCR
- Python-based
- Extract text from image
Tags
- pdf-text-extractor
- retrieval-augmented-generation
- Extract text
- python-lib
- pandoc
- python-library
- tesseract-ocr
- tesseract
What is Kreuzberg?
Kreuzberg is a Python library for text extraction from documents. It provides a unified async interface for extracting text from PDFs, images, office documents, and more.
Why Kreuzberg?
- Simple and Hassle-Free: Clean API that just works, without complex configuration
- Local Processing: No external API calls or cloud dependencies required
- Resource Efficient: Lightweight processing without GPU requirements
- Small Package Size: A few curated dependencies and a minimal footprint
- Format Support: Comprehensive support for documents, images, and text formats
- Modern Python: Built with async/await, type hints, and a functional-first approach
- Permissive OSS: Kreuzberg and its dependencies have a permissive OSS license
Kreuzberg was built for RAG (Retrieval-Augmented Generation) applications, focusing on local processing with minimal dependencies. It is designed for modern async applications, serverless functions, and Dockerized applications.
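To illustrate why an async extraction interface suits these kinds of applications, here is a minimal sketch of extracting text from several files concurrently with a cap on parallelism. It does not call Kreuzberg itself: `Extracted`, `Extractor`, `read_plain_text`, and `extract_many` are hypothetical names invented for this example, and the stand-in extractor only reads plain-text files. With Kreuzberg installed, a thin async wrapper around its extraction function could presumably be passed in instead; check the library's documentation for the exact function name and return type.

```python
import asyncio
from dataclasses import dataclass
from pathlib import Path
from typing import Awaitable, Callable


@dataclass(frozen=True)
class Extracted:
    """Hypothetical result type for this sketch: source path plus extracted text."""
    path: Path
    content: str


# Any async callable that maps a file path to its text. This is an assumption of
# the sketch, not Kreuzberg's actual signature.
Extractor = Callable[[Path], Awaitable[str]]


async def read_plain_text(path: Path) -> str:
    """Stand-in extractor: read a UTF-8 text file in a worker thread."""
    return await asyncio.to_thread(path.read_text, encoding="utf-8")


async def extract_many(
    paths: list[Path], extractor: Extractor, limit: int = 4
) -> list[Extracted]:
    """Extract text from several files concurrently, at most `limit` at a time."""
    semaphore = asyncio.Semaphore(limit)

    async def extract_one(path: Path) -> Extracted:
        # The semaphore bounds how many extractions run at once.
        async with semaphore:
            return Extracted(path=path, content=await extractor(path))

    # asyncio.gather returns results in the same order as the input paths.
    return list(await asyncio.gather(*(extract_one(p) for p in paths)))
```

A caller would run it with something like `asyncio.run(extract_many(paths, read_plain_text))`. Passing the extractor as a parameter keeps the concurrency logic independent of any one library, and the semaphore limit is a simple way to keep memory use bounded in serverless or containerized environments.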