The most comprehensive benchmark to date for evaluating document understanding capabilities of Vision-Language Models (VLMs).
What is it?
A unified evaluation suite covering 6 core IDP tasks across 16 datasets and 9,229 documents:
Each task uses multiple datasets, including real-world, synthetic, and newly annotated ones.
Highlights from the Benchmark
Why does this matter?
There’s currently no unified benchmark that evaluates all IDP tasks together — most leaderboards (e.g., OpenVLM, Chatbot Arena) don’t deeply assess document understanding.
Document Variety
We evaluated models on a wide range of documents: Invoices, forms, receipts, charts, tables (structured + unstructured), handwritten docs, and even diacritics texts.
Get Involved
We’re actively updating the benchmark with new models and datasets.
This is developed with collaboration from IIT Indore and Nanonets.
Leaderboard: https://idp-leaderboard.org/
Release blog: https://idp-leaderboard.org/details/
GithHub: https://github.com/NanoNets/docext/tree/main/docext/benchmark
Feel free to share your feedback!
Article URL: https://www.bloomberg.com/news/articles/2025-05-03/warren-buffett-to-step-down-from-berkshire-hathaway-at-year-end
Comments URL: https://news.ycombinator.com/item?id=43880973
Points: 341
# Comments: 265
Article URL: https://waymo.com/blog/2025/04/waymo-and-toyota-outline-strategic-partnership
Comments URL: https://news.ycombinator.com/item?id=43839123
Points: 381
# Comments: 356