Mistral OCR 3 delivers major leap in accuracy and efficiency for text and image extraction
Mistral has introduced OCR 3, setting a new standard for accuracy and efficiency in document processing. The updated model extracts both text and embedded images from a wide range of documents with high fidelity. It introduces Markdown output, along with HTML-based reconstruction of tables, allowing downstream systems to preserve document structure as well as content.
Mistral OCR 3 surpasses its predecessor, achieving a 74% win rate over Mistral OCR 2 on challenging domains such as forms, scanned documents, complex tables, and handwriting. The model significantly improves interpretation of cursive handwriting, mixed-content annotations, and handwritten entries layered on printed forms, as well as detection of boxes, labels, and dense layouts. It is also more robust, with greater resistance to compression artifacts, skew, document distortion, low resolution (low DPI), and background noise. Table reconstruction is more advanced as well: the model now recognizes headers, merged cells, multi-row blocks, and column-based hierarchies.
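To illustrate why HTML tables inside Markdown help downstream systems, here is a minimal sketch using only the Python standard library. It assumes merged cells arrive as ordinary `rowspan`/`colspan` attributes on `<th>`/`<td>` elements, which the article does not confirm. The `sample` string is hand-written for illustration and is not real OCR 3 output, and `TableGridParser` and `expand_spans` are hypothetical names.

```python
from html.parser import HTMLParser


def expand_spans(rows):
    """Repeat merged cells so every row has one value per column."""
    grid = {}
    for r, row in enumerate(rows):
        c = 0
        for text, rowspan, colspan in row:
            while (r, c) in grid:  # skip slots filled by an earlier rowspan
                c += 1
            for dr in range(rowspan):
                for dc in range(colspan):
                    grid[(r + dr, c + dc)] = text
            c += colspan
    if not grid:
        return []
    n_rows = max(r for r, _ in grid) + 1
    n_cols = max(c for _, c in grid) + 1
    return [[grid.get((r, c), "") for c in range(n_cols)] for r in range(n_rows)]


class TableGridParser(HTMLParser):
    """Collects each non-nested <table> as a rectangular grid of strings."""

    def __init__(self):
        super().__init__()
        self.tables = []   # finished tables, one grid per <table>
        self._rows = None  # rows of the table currently being read
        self._cell = None  # [text_parts, rowspan, colspan] of the open cell

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "table":
            self._rows = []
        elif tag == "tr" and self._rows is not None:
            self._rows.append([])
        elif tag in ("td", "th") and self._rows:
            rowspan = int(attrs.get("rowspan") or 1)
            colspan = int(attrs.get("colspan") or 1)
            self._cell = [[], rowspan, colspan]

    def handle_data(self, data):
        if self._cell is not None:
            self._cell[0].append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            parts, rowspan, colspan = self._cell
            self._rows[-1].append(("".join(parts).strip(), rowspan, colspan))
            self._cell = None
        elif tag == "table" and self._rows is not None:
            self.tables.append(expand_spans(self._rows))
            self._rows = None


# Hand-written stand-in for Markdown output that embeds an HTML table.
sample = """# Invoice

<table>
<tr><th rowspan="2">Item</th><th colspan="2">Cost</th></tr>
<tr><th>Net</th><th>Tax</th></tr>
<tr><td>Paper</td><td>10</td><td>2</td></tr>
</table>
"""

parser = TableGridParser()
parser.feed(sample)
tables = parser.tables
for row in tables[0]:
    print(row)
```

Each merged cell is copied into every grid slot it covers, so the two-level header survives as two complete rows. A plain Markdown pipe table has no syntax for merged cells, so that structure would be lost there.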
The model is available through the API or the updated Document AI user interface in Mistral AI Studio, delivering instant extraction as plain text or structured JSON for developers and business users alike. This version delivers upgrades across all languages and document types. Pricing begins at $2 per 1,000 pages, dropping to $1 per 1,000 pages in batch mode. Mistral OCR 3 is fully backward compatible with OCR 2.
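As a rough guide to what those prices mean for a given workload, the sketch below does the arithmetic. It assumes cost scales linearly with page count at the two quoted rates, with no minimums, tiers, or rounding, none of which the article specifies. `estimate_cost` is a hypothetical helper, not part of any Mistral SDK.

```python
# USD per 1,000 pages, as quoted in the article.
PRICE_PER_1000_PAGES = {"standard": 2.0, "batch": 1.0}


def estimate_cost(pages, mode="standard"):
    """Estimated cost in USD, assuming simple pro-rata pricing."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return PRICE_PER_1000_PAGES[mode] * pages / 1000


standard_cost = estimate_cost(250_000)
batch_cost = estimate_cost(250_000, mode="batch")
print(f"standard: ${standard_cost:,.2f}, batch: ${batch_cost:,.2f}")
```

Under these assumptions, a 250,000-page archive comes to about $500 at the standard rate and about $250 in batch mode.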
Comments
Impressive results, but Mistral's OCR models are not open, so they cannot be used offline.
Tesseract, Donut and such are still the way to go for that.