CREMMA Medii Aevi: Literary manuscript text recognition in Latin - HAL open archive
Preprint, Working Paper. Year: 2023

CREMMA Medii Aevi: Literary manuscript text recognition in Latin

Thibault Clérice, Malamatenia Vlachou-Efstathiou, Alix Chagué

Abstract

This paper presents a novel segmentation and handwritten text recognition dataset for Medieval Latin, covering the 11th to the 16th century. It connects with existing Medieval French and earlier Latin datasets by enforcing common guidelines. We provide our own additions to Ariane Pinche's Old French guidelines to deal with cases specific to Latin. We also offer an overview of how we approached the compilation of this dataset through the use of pre-existing resources. With a higher abbreviation ratio and a better representation of abbreviating marks, we offer new models that outperform the base Old French model on Latin datasets, reaching readability levels on unknown manuscripts.
Main file
JOHD_Cremma_Medii_Aevi.pdf (1.64 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-03828353, version 1 (25-10-2022)
hal-03828353, version 2 (02-11-2022)
hal-03828353, version 3 (29-11-2022)
hal-03828353, version 4 (11-01-2023)

Identifiers

  • HAL Id: hal-03828353, version 4

Cite

Thibault Clérice, Malamatenia Vlachou-Efstathiou, Alix Chagué. CREMMA Medii Aevi: Literary manuscript text recognition in Latin. 2023. ⟨hal-03828353v4⟩