
From Scribbles to Text: A Novel Transformer-Based Recognition Model for Child Handwriting

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Handwritten Text Recognition (HTR) remains a challenging task, particularly for child handwriting, which often exhibits irregular letter formation, letter crowding, mirrored letters, and phonological spelling errors. These characteristics are rare in adult handwriting, which forms the primary training data for most existing HTR systems and multimodal large language models (MLLMs). Consequently, current models often autocorrect or misinterpret these features, limiting their effectiveness in contexts where the text must be transcribed exactly as written. This is particularly problematic for identifying specific learning disabilities (SLDs) such as dyslexia and dysgraphia, where letter reversals, inversions, and spelling mistakes are key diagnostic indicators. To address this gap, we introduce Extended-TrOCR (E-TrOCR), an adaptation of the transformer-based optical character recognition (TrOCR) model specifically designed for child handwriting. E-TrOCR uses a two-stage training process: it is first trained on the IAM dataset for general handwriting recognition and then fine-tuned on a dedicated child handwriting dataset. The model employs character-level tokenization to prevent autocorrection and introduces a novel 220-alphabet to represent letter reversals and inversions. Trained on over 1,800 text lines from elementary school students, E-TrOCR significantly outperforms state-of-the-art HTR models, underscoring the necessity of dedicated solutions for child handwriting recognition.
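The idea of character-level tokenization with extra symbols for mirrored letters can be illustrated with a small sketch. This is not the E-TrOCR implementation: the `<rev:x>` marker syntax, the tiny vocabulary, and the function names `tokenize`, `encode`, and `decode` are all assumptions made for illustration, and the actual 220-symbol alphabet is not reproduced here.

```python
from typing import Dict, List

# Hypothetical vocabulary: ordinary characters plus one extra token per
# mirrored letter. The "<rev:x>" notation is an assumption for this sketch.
SPECIAL: List[str] = ["<pad>", "<s>", "</s>", "<unk>"]
BASE_CHARS: List[str] = list("abcdefghijklmnopqrstuvwxyz ")
REVERSED: List[str] = [f"<rev:{c}>" for c in "bdpqsz"]

VOCAB: List[str] = SPECIAL + BASE_CHARS + REVERSED
TOKEN_TO_ID: Dict[str, int] = {tok: i for i, tok in enumerate(VOCAB)}


def tokenize(text: str) -> List[str]:
    """Split text into single characters, keeping reversal markers whole."""
    tokens: List[str] = []
    i = 0
    while i < len(text):
        if text.startswith("<rev:", i):
            end = text.find(">", i)
            candidate = text[i:end + 1] if end != -1 else ""
            if candidate in TOKEN_TO_ID:
                tokens.append(candidate)
                i = end + 1
                continue
        ch = text[i]
        tokens.append(ch if ch in TOKEN_TO_ID else "<unk>")
        i += 1
    return tokens


def encode(text: str) -> List[int]:
    """Map text to token ids, wrapped in start and end markers."""
    body = [TOKEN_TO_ID[tok] for tok in tokenize(text)]
    return [TOKEN_TO_ID["<s>"]] + body + [TOKEN_TO_ID["</s>"]]


def decode(ids: List[int]) -> str:
    """Map token ids back to text, dropping padding and boundary markers."""
    skip = {"<pad>", "<s>", "</s>"}
    return "".join(VOCAB[i] for i in ids if VOCAB[i] not in skip)


# The misspelling "teh" and the mirrored "d" both survive a round trip,
# because no token spans more than one written character.
sample = "teh <rev:d>og"
round_trip = decode(encode(sample))
```

Because every token corresponds to a single written symbol, there are no word or subword units that could pull a misspelling toward a dictionary form, which is the property the abstract attributes to character-level tokenization.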

Original language: English
Title of host publication: Document Analysis and Recognition - ICDAR 2025 - 19th International Conference, Proceedings
Editors: Xu-Cheng Yin, Dimosthenis Karatzas, Daniel Lopresti
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 115-131
Number of pages: 17
ISBN (Print): 9783032046130
State: Published - 2026
Event: 19th International Conference on Document Analysis and Recognition, ICDAR 2025 - Wuhan, China
Duration: Sep 16 2025 to Sep 21 2025

Publication series

Name: Lecture Notes in Computer Science
Volume: 16023 LNCS

Conference

Conference: 19th International Conference on Document Analysis and Recognition, ICDAR 2025
Country/Territory: China
City: Wuhan
Period: 09/16/25 to 09/21/25

Keywords

  • Child Handwriting
  • Handwritten Text Recognition
  • Multimodal Large Language Models
  • Transformers
