HTR with VLM

This handwritten text recognition pipeline uses Florence-2 fine-tuned for text line detection and OCR tasks. Steps in the pipeline:

This space does not have access to a GPU. Inference on CPU is extremely slow, so I cached the example results to disk. Some notes:

To view example outputs, select one of the example images and choose Used cached result: True. To transcribe an example from scratch, choose False.
Newly uploaded images are always transcribed from scratch.
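
The cached-result behaviour described in the notes can be sketched roughly as follows. This is only an illustration, not the code this space actually runs: the names `CACHE_DIR`, `cache_path`, `transcribe_with_cache` and `run_pipeline` are hypothetical, and keying the cache on a hash of the image bytes and storing results as JSON are assumptions. Writing the result back to disk after a from-scratch run is also an assumption.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable

# Hypothetical cache location; the real space may store results elsewhere.
CACHE_DIR = Path("cache")


def cache_path(image_bytes: bytes, cache_dir: Path = CACHE_DIR) -> Path:
    """Derive a cache file name from the image content (assumed keying scheme)."""
    digest = hashlib.sha256(image_bytes).hexdigest()
    return cache_dir / f"{digest}.json"


def transcribe_with_cache(
    image_bytes: bytes,
    run_pipeline: Callable[[bytes], dict],
    use_cache: bool = True,
    cache_dir: Path = CACHE_DIR,
) -> dict:
    """Return a cached result if allowed and present, else run the slow pipeline.

    `run_pipeline` stands in for the line detection + OCR steps; it is a
    placeholder, not an actual Florence-2 call.
    """
    path = cache_path(image_bytes, cache_dir)
    if use_cache and path.exists():
        # Cache hit: skip CPU inference entirely.
        return json.loads(path.read_text(encoding="utf-8"))

    # Cache miss, or the user asked to transcribe from scratch.
    result = run_pipeline(image_bytes)
    cache_dir.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(result, ensure_ascii=False), encoding="utf-8")
    return result
```

With this layout, a newly uploaded image has no cache file yet, so it goes through the full pipeline regardless of the flag, which matches the behaviour noted above.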