docext

docext extracts structured information from documents such as invoices, passports, and other forms. It uses vision-language models (VLMs) to identify and extract both field data and tabular information from document images.

Cost / License

Free
Open Source (Apache-2.0)

Origin

United States

Platforms

Self-Hosted
Docker
Python

Features

OCR
Structured data
PDF OCR
REST API
Python-based
On-premises software

docext information

Developed by
NanoNets
Licensing
Free and open source (Apache-2.0).
Written in
Python
Alternatives
12 alternatives listed
Supported Languages
- English

AlternativeTo Categories

Development, Office & Productivity

GitHub repository

1,836 Stars
136 Forks
21 Open Issues
Updated Aug 25, 2025

docext was added to AlternativeTo by Paul on Apr 8, 2025 and this page was last updated Apr 8, 2025.

Features:

- User-friendly interface: built with Gradio for easy document processing
- Flexible extraction: define custom fields or use pre-built templates
- Table extraction: extract structured tabular data from documents
- Confidence scoring: get confidence levels for extracted information
- On-premises deployment: run entirely on your own infrastructure
- Multi-page support: process documents with multiple pages
- REST API: programmatic access for integration with your applications
- Pre-built templates: ready-to-use templates for common document types:
  - Invoices
  - Passports
- Template customization: add or delete fields and columns to adapt a template to other document types
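
The template and confidence features above can be pictured with a small, self-contained Python sketch. It does not use docext's actual API: the names `FieldSpec`, `ExtractedField`, `INVOICE_TEMPLATE`, `customize`, and `needs_review`, the example invoice fields, and the numeric 0 to 1 confidence scale are all assumptions made for illustration. It only shows the general idea of starting from a pre-built field list, adding or removing fields, and flagging low-confidence values for manual review.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class FieldSpec:
    """Hypothetical description of one field to extract (not docext's API)."""
    name: str
    description: str = ""


@dataclass(frozen=True)
class ExtractedField:
    """Hypothetical extraction result; the 0-1 confidence scale is assumed."""
    name: str
    value: str
    confidence: float


# Illustrative stand-in for a pre-built invoice template.
INVOICE_TEMPLATE: List[FieldSpec] = [
    FieldSpec("invoice_number", "Identifier printed on the invoice"),
    FieldSpec("invoice_date", "Date the invoice was issued"),
    FieldSpec("total_amount", "Grand total including tax"),
]


def customize(
    template: Iterable[FieldSpec],
    add: Iterable[FieldSpec] = (),
    remove: Iterable[str] = (),
) -> List[FieldSpec]:
    """Return a new field list: drop names in `remove`, then append `add`."""
    removed = set(remove)
    result = [f for f in template if f.name not in removed]
    seen = {f.name for f in result}
    for spec in add:
        if spec.name not in seen:  # skip duplicates by field name
            result.append(spec)
            seen.add(spec.name)
    return result


def needs_review(
    results: Iterable[ExtractedField], threshold: float = 0.8
) -> List[ExtractedField]:
    """Return the results whose confidence falls below `threshold`."""
    return [r for r in results if r.confidence < threshold]


if __name__ == "__main__":
    fields = customize(
        INVOICE_TEMPLATE,
        add=[FieldSpec("po_number", "Purchase order reference")],
        remove=["invoice_date"],
    )
    print([f.name for f in fields])

    # Made-up extraction results, only to exercise needs_review().
    sample = [
        ExtractedField("invoice_number", "INV-001", 0.95),
        ExtractedField("po_number", "PO-77", 0.40),
    ]
    print([r.name for r in needs_review(sample)])
```

Keeping the template as plain data makes the add and delete operations simple list transformations, and a single threshold is the simplest way to route uncertain values to a human reviewer. A real deployment would follow whatever schema and confidence format the tool itself exposes.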
