Updating marker-pdf
Introduction
marker-pdf has received several updates and improvements over the past months, including a new major release, so I'm upgrading to the latest version.
Among the changes:
- Improve performance by 10-15% (0.3.10)
- 2x faster due to a new layout model (1.0.0)
- Consistent internal schema for blocks and pages (1.0.0)
- Much higher quality output (1.0.0)
- Fix lots of misc bugs, including encoding, empty page problems, and image rendering (1.0.1)
- Improve list processing with joining and nesting (1.0.1)
- Add in blockquotes (1.0.1)
- Slightly improve performance (1.0.1)
- Automatically detect bad OCR text and re-OCR the document. This consists of some PDF-level heuristics and a new OCR quality model. (1.2.0)
- Layout model is now half the size and ~2x faster (most of the runtime in the general case is layout, so this should result in a big overall speedup). It's also more accurate. (1.2.0)
- Tables now handle colspans and rowspans properly (1.3.0)
- Improved table model with better accuracy (1.3.0)
- Links and references are now pulled out of the PDF and are clickable (1.3.0)
- Anchors are placed on elements as targets (1.3.0)
- Better inline math detection with an improved model. (1.6.0)
Lots of good things for almost no cost!
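Before upgrading, it can help to check which version is currently installed. The following is a minimal sketch using only the Python standard library. The helper names (`parse_version`, `installed_version`, `needs_upgrade`) are made up for illustration, the target of 1.6.0 is simply the most recent release mentioned in the list above, and the comparison assumes plain numeric `X.Y.Z` version strings (no pre-release suffixes).

```python
from importlib.metadata import PackageNotFoundError, version


def parse_version(text):
    """Turn a plain 'X.Y.Z' string into a tuple of ints for comparison.

    Assumption: purely numeric components; suffixes like 'rc1' are not handled.
    """
    return tuple(int(part) for part in text.split("."))


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def needs_upgrade(current, target):
    """True when nothing is installed or the installed version is older."""
    return current is None or parse_version(current) < parse_version(target)


if __name__ == "__main__":
    current = installed_version("marker-pdf")
    print("installed:", current)
    # 1.6.0 is the newest release listed above, not necessarily the latest.
    print("upgrade needed:", needs_upgrade(current, "1.6.0"))
```

Comparing tuples of integers rather than raw strings matters here: as strings, "0.3.10" would sort before "0.3.9", while as tuples the ordering is correct.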

