jlsutherland/doc2text: Detect text blocks and OCR poorly scanned PDFs in bulk. Python module available via pip.

Stars: 1256 | Watchers: 39 | Forks: 103 | Issues: 15