Big Data in Text Recognition and Image Publication
15.06.2026 H15:00-20:00
16.06.2026 H09:00-12:30
REGISTRATION HERE TO PARTICIPATE 15.06.2026
REGISTRATION HERE TO PARTICIPATE 16.06.2026
The workshop is organised in collaboration with DHCH, SCOOP (Source Code of the Past), e-Codices ORD, and the Flow (SNSF-Project).
Digitisation of cultural heritage remains a key and long-standing focus within the Digital Humanities. Over the past two decades, initiatives like e-codices and national libraries, such as the Bibliothèque Nationale de France, have made their historical (hand-)written documents accessible to the public. This access has not only stimulated paleographic and codicological research but has also enabled significant progress in machine learning areas, particularly automatic text recognition (and indirectly guided progress with visual language models).
While current state-of-the-art methods allow us to process large volumes of documents with good results, this data reuse has only marginally advanced the methodologies in the humanities, which primarily focus on qualitative questions. This year’s DHCH workshop aims to facilitate discussions among researchers, GLAM specialists, digitisation experts, and the wider DHCH community, with the goal of supporting the next generation not only in profiting from these developments but also in rethinking approaches to the material of the past.

PROGRAMME:
DAY 1
H15:00-17:00 Welcome and Project Pitching round by Participants
H17:00-18:00 Keynote lecture
Presenting and describing manuscripts (e-codices)
Dr William Duba
H18:00-19:00 Keynote lecture
Transcribing and Analysing Digitally
Dr Katarzyna Anna Kapitan
H19:00-20:00 Wrap-up Day 1
DAY 2
H09:00-09:15 Welcome
H09:15-10:45 Keynote lecture
Automatic Pre-Editorialisation
Dr Simon Gabay
H10:45-11:00 Coffee Break
H11:00-12:30 Keynote lecture
Semi-automatic Annotation of Complex Textual Structures: The Semper Case
Dr Elena Chestnova
Dr William Duba (Ph.D., History, University of Iowa, 2006; Habilitation, Philosophy, University of Fribourg, 2017) is a specialist in medieval intellectual history, history of philosophy and theology, and fragmentology. His work with medieval manuscripts brought him back to Fribourg in 2016 to manage the Fragmentarium: Laboratory for Medieval Manuscript Fragments project. Since 2022, he has also been responsible for coordinating e-codices: Virtual Manuscript Library of Switzerland for the University of Fribourg.
Dr Simon Gabay studied at Paris IV-Sorbonne and the University of St Andrews before completing a PhD in Latin philology at the University of Amsterdam (UvA), with a dissertation on the history of the actor in the Middle Ages. He then pursued postdoctoral research in computational philology as part of a Swiss National Science Foundation (SNSF) project on the manuscripts of Sévigné at the University of Neuchâtel, where he established the first courses in digital humanities. In 2020, he joined the University of Geneva as a Maître-assistant under the chair of Béatrice Joyeux-Prunel, where he leads the FreEM project (resources and applications for Classical French) and FoNDUE (information extraction from historical documents). He has previously led projects such as Katabase (on the late 19th-century manuscript market) and Galli(corpor)a (tools for restructuring historical documents from Gallica). His research focuses on early modern French philology, classical French literature, natural language processing, and optical character recognition.
Dr Katarzyna Anna Kapitan holds a Junior Professor Chair in "Computational Analysis of Western Written Culture" at École nationale des chartes – PSL Université (Paris, France). Attached to the Centre Jean-Mabillon for her research activities, Katarzyna teaches and supervises students of the two MA degrees in digital humanities run by the ENC – PSL. Her research spans various aspects of Nordic culture (medieval and early modern), including manuscript studies, book history, textual criticism, transmission and reception history, and history of historiography.
Dr Elena Chestnova: "I started out as an architect, moved over into history with a specific interest in material culture and into DH via digital editions. These days I lead the Gottfried Semper Edition, teach theory and history of architecture, lead two small projects on open research data (one on APIs for editions and one on semantic ontologies for text) and develop a large study of textual corpora with the aim of investigating public opinions about architecture and urban growth in the (very) long nineteenth century."
SAVE THE DATE
Register for this event to receive an e-mail notification