Show HN: Side-by-side PDF parser comparison for RAG pipelines

github.com

1 points by 2dogsanerd 41 minutes ago

A simple tool to compare how different PDF parsers handle your documents.

Shows naive parsing (pypdf) vs layout-aware parsing (Docling) side-by-side.

Helps spot issues with scans, tables, and multi-column layouts before they cause problems in your RAG system.

Parsers are easy to swap if you want to try alternatives.
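
For anyone curious what "easy to swap" could look like, here is a rough sketch and not the tool's actual code. It assumes each parser is just a function from a file path to text. The names `PARSERS`, `side_by_side`, and `compare` are made up for illustration. The pypdf and Docling calls are those libraries' public entry points as I understand them (`PdfReader(...).pages[i].extract_text()` and `DocumentConverter().convert(...).document.export_to_markdown()`), imported lazily so you only need the ones you use.

```python
from itertools import zip_longest
from typing import Callable, Dict

# Hypothetical convention: a parser takes a PDF path and returns plain text.
Parser = Callable[[str], str]


def parse_with_pypdf(path: str) -> str:
    """Naive extraction: concatenate the text of each page."""
    from pypdf import PdfReader  # imported lazily; requires `pip install pypdf`

    reader = PdfReader(path)
    # extract_text() may yield nothing for image-only (scanned) pages.
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def parse_with_docling(path: str) -> str:
    """Layout-aware extraction, exported as Markdown."""
    from docling.document_converter import DocumentConverter  # requires docling

    result = DocumentConverter().convert(path)
    return result.document.export_to_markdown()


# Swapping in another parser means adding one entry here (illustrative registry).
PARSERS: Dict[str, Parser] = {
    "pypdf": parse_with_pypdf,
    "docling": parse_with_docling,
}


def side_by_side(left: str, right: str, width: int = 40) -> str:
    """Render two texts as two columns, truncating each line to `width`."""
    rows = []
    for l, r in zip_longest(left.splitlines(), right.splitlines(), fillvalue=""):
        rows.append(f"{l[:width]:<{width}} | {r[:width]}")
    return "\n".join(rows)


def compare(path: str, left_name: str = "pypdf", right_name: str = "docling",
            parsers: Dict[str, Parser] = PARSERS) -> str:
    """Run two registered parsers on the same file and show both outputs."""
    return side_by_side(parsers[left_name](path), parsers[right_name](path))
```

Calling `print(compare("report.pdf"))` (with a PDF of your own) would print the two outputs in columns. Line-by-line pairing is crude, since the parsers will not break lines in the same places, but it is usually enough to see where a table or a two-column page got scrambled.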