allenai

olmocr

Toolkit for linearizing PDFs for LLM datasets/training

Trusted Project, Python
18.1k stars, 295 stars today (via trending-daily)