Kreuzberg

Kreuzberg is a Python library for text extraction from documents. It provides a unified async interface for extracting text from PDFs, images, office documents, and more.

Cost / License

  • Free
  • Open Source

Platforms

  • Python
  • Linux
  • Mac
  • Windows

Features

  1.  OCR
  2.  Python-based
  3.  Extract text from image

Tags

  • pdf-text-extractor
  • retrieval-augmented-generation
  • Extract text
  • python-lib
  • pandoc
  • python-library
  • tesseract-ocr
  • tesseract

Kreuzberg information

  • Developed by

    Na'aman Hirschfeld
  • Licensing

    Free and open source (MIT license).
  • Supported Languages

    • English

AlternativeTo Category

Office & Productivity

GitHub repository

  •  2,580 Stars
  •  114 Forks
  •  8 Open Issues
Kreuzberg was added to AlternativeTo by Paul.

Why Kreuzberg?

  • Simple and Hassle-Free: Clean API that just works, without complex configuration
  • Local Processing: No external API calls or cloud dependencies required
  • Resource Efficient: Lightweight processing without GPU requirements
  • Small Package Size: Has few curated dependencies and a minimal footprint
  • Format Support: Comprehensive support for documents, images, and text formats
  • Modern Python: Built with async/await, type hints, and a functional-first approach
  • Permissive OSS: Kreuzberg and its dependencies have a permissive OSS license

Kreuzberg was built for RAG (Retrieval Augmented Generation) applications, focusing on local processing with minimal dependencies. It's designed for modern async applications, serverless functions, and dockerized applications.
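
To make the idea of a "unified async interface" concrete, here is a minimal standard-library sketch of the pattern: one coroutine that picks an extractor from the file suffix and runs the blocking work in a worker thread. This is not Kreuzberg's API. Every name in it (`extract_text`, `extract_many`, `_EXTRACTORS`, `demo`) is hypothetical and chosen only for illustration, and it handles only plain text and HTML rather than PDFs or images.

```python
import asyncio
import tempfile
from html.parser import HTMLParser
from pathlib import Path


class _TextCollector(HTMLParser):
    """Collects the non-empty text nodes of an HTML document."""

    def __init__(self) -> None:
        super().__init__()
        self.parts: list[str] = []

    def handle_data(self, data: str) -> None:
        if data.strip():
            self.parts.append(data.strip())


def _read_plain(path: Path) -> str:
    return path.read_text(encoding="utf-8")


def _read_html(path: Path) -> str:
    collector = _TextCollector()
    collector.feed(path.read_text(encoding="utf-8"))
    collector.close()
    return " ".join(collector.parts)


# Hypothetical registry (an assumption of this sketch): suffix -> blocking extractor.
_EXTRACTORS = {".txt": _read_plain, ".md": _read_plain, ".html": _read_html}


async def extract_text(path: Path) -> str:
    """Single async entry point; the format is chosen from the file suffix."""
    try:
        extractor = _EXTRACTORS[path.suffix.lower()]
    except KeyError:
        raise ValueError(f"unsupported format: {path.suffix}") from None
    # Blocking file work runs in a worker thread so the event loop stays free.
    return await asyncio.to_thread(extractor, path)


async def extract_many(paths: list[Path]) -> list[str]:
    """Extract several documents concurrently, preserving input order."""
    return list(await asyncio.gather(*(extract_text(p) for p in paths)))


def demo() -> list[str]:
    """Write two small sample files and extract both through the same call."""
    with tempfile.TemporaryDirectory() as tmp:
        note = Path(tmp) / "note.txt"
        page = Path(tmp) / "page.html"
        note.write_text("plain text", encoding="utf-8")
        page.write_text(
            "<html><body><p>Hello</p><p>world</p></body></html>",
            encoding="utf-8",
        )
        return asyncio.run(extract_many([note, page]))


if __name__ == "__main__":
    print(demo())
```

The caller uses the same coroutine for every format, which is the property the description above attributes to Kreuzberg; supporting another format in this sketch would only mean registering one more function.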

Official Links