MEZANNO

#Textual corpora annotation #Human-in-the-loop AI #Digital humanities #Historical document processing

The Image Processing and Pattern Recognition group participates in the MEZANNO project, which also involves the National Library of France (BnF), the French National Institute of Geographic and Forest Information (IGN), and the School for Advanced Studies in the Social Sciences (EHESS). The project aims to facilitate the collaborative annotation of textual resources published through the IIIF API, with a particular focus on enhancing the Mirador viewer.

Building on recent advances in artificial intelligence, cross-platform user interfaces, and web service deployment, the project addresses a pressing need in the social sciences: enabling researchers to efficiently query, organize, and build research objects from large-scale digitized archival collections. Historical corpora such as censuses, directories, land-use records, and official publications can now be processed semi-automatically, but the lack of appropriate tools still limits their use, especially for massive datasets.

The MEZANNO consortium aims to provide a suite of open, interoperable tools to support three core stages of the research process:

  • Corpus construction, by assembling resources via the IIIF standard,
  • Raw data extraction, assisted by AI modules for transcription and content recognition,
  • Data structuring, based on user-defined models, with export formats that ensure interoperability.

The project is also committed to fostering a community of users and contributors to support long-term adoption and collaborative development.

Partners