Handwritten text recognition (HTR)

Description

Like the MNIST digit recognition project, this project tackles the harder problem of handwritten text recognition: the model has to identify not only individual characters but also their correct order.

The model was trained on the IAM Handwriting Database, with data preprocessing implemented in Python. Using OpenCV, the pipeline first detects text lines and then segments them into individual words. The model was built and trained with PyTorch and PyTorch Lightning, using a Convolutional Recurrent Neural Network (CRNN) architecture. This design combines a convolutional neural network, which extracts visual features from the image, with a recurrent neural network, which reads those features as a sequence. Training uses the Connectionist Temporal Classification (CTC) loss function, which lets the model learn the alignment between an input image and its text without per-character position labels.
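To make the CTC idea more concrete, here is a minimal sketch of how a per-frame prediction can be turned back into text. It is not the code from the project: the alphabet, the choice of index 0 as the blank symbol, and the function names `ctc_collapse` and `greedy_decode` are assumptions made for illustration, and it uses plain Python lists instead of PyTorch tensors. The recurrent part of a CRNN emits one score vector per horizontal image slice; greedy decoding takes the best class in every slice, merges consecutive repeats, and then drops the blanks.

```python
from itertools import groupby

# Assumption: index 0 is the CTC blank; "-" is only a placeholder for it.
BLANK = 0
ALPHABET = "-abcdefghijklmnopqrstuvwxyz"  # hypothetical character set


def ctc_collapse(frame_labels):
    """Merge consecutive repeated labels, then remove blanks."""
    merged = [label for label, _ in groupby(frame_labels)]
    return [label for label in merged if label != BLANK]


def greedy_decode(frame_scores):
    """Pick the highest-scoring class per frame and collapse the result."""
    best = [
        max(range(len(scores)), key=scores.__getitem__)
        for scores in frame_scores
    ]
    return "".join(ALPHABET[i] for i in ctc_collapse(best))


# Nine frames spelling "hello": the blank between the two runs of "l"
# is what keeps the double letter from being merged into one.
labels = [8, 8, 0, 5, 12, 12, 0, 12, 15]
print(ctc_collapse(labels))
```

The blank symbol is the reason a word such as "hello" can be recovered at all: without a blank between them, two adjacent runs of the same letter would collapse into a single character. During training, the CTC loss sums over all frame-level labelings that collapse to the target text, so no character positions need to be annotated.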

The full source code and the accompanying report are available on GitLab.


I made this project for the university course “Intelligent virtualisation of systems and process automation” at Wrocław University of Science and Technology.