Digital editing of the cuneiform texts from Haft Tappeh

Finished

Cuneiform texts are transliterated and made available digitally (Vanessa Liebler for the i3mainz, CC BY SA 4.0)

The project “Digital editing of the cuneiform texts from Haft Tappeh” is dedicated to transliterating more than 600 cuneiform texts from Haft Tappeh, Iran, and making them digitally available. The aim of the project is to elaborate and further develop a digital workflow that takes into account existing tools, international standards and computational-linguistic analysis methods.

Motivation

Haft Tappeh (the ancient city of Kabnak) is located in southwestern Iran in the province of Khuzestan, about 15 km southeast of the ancient city of Susa. Its geographical position made Haft Tappeh an important site in Bronze Age history and culture. To date, large-scale excavations have uncovered more than 1,400 text fragments of cuneiform tablets in the Babylonian language, along with the architectural remains of a palace. Most of them are administrative documents, whose final linguistic editing is still pending.

In the first phase of the project, funded by the German Research Foundation (DFG), the 600 to 650 texts excavated by Behzad Mofidi-Nasrabadi of Johannes Gutenberg University Mainz will be digitally edited and prepared for machine reading. Processing the texts with contemporary methods and making the results openly available is intended to enable the investigation of the paleography, lexis, syntax, tablet formats, text categories, bureaucratic protocol and modus operandi of this important text corpus beyond the confines of Ancient Near Eastern Studies.

For this purpose, the i3mainz is developing a digital workflow for cuneiform tablets that starts with the existing 3D data and photographs of the tablets and digitally processes the contents using transliteration and computational-linguistic and semantic annotation. The focus here is not on the creation of a new portal to make cuneiform data available, but rather on the production of FAIR data that can be integrated into other repositories, whether existing or under construction. The acronym FAIR stands for findable, accessible, interoperable and reusable, and it refers to the internationally accepted principles for making research data available. The tools developed are made available in a Git repository, making it easier to replicate the workflows in other projects. Consequently, not only are the data findable, accessible, interoperable and reusable, but the software is as well.

Activities

In February 2021, colleagues from Assyriology and Computational Linguistics who work on digital editions, their infrastructural prerequisites, their philological and linguistic requirements and the relevant information-science concepts exchanged ideas at the DFG-funded virtual workshop “Status quo and current developments in digital cuneiform editions”, organised by i3mainz. The aim of the workshop was to jointly develop strategies for the digital edition of cuneiform texts.

It was preceded by a workshop for young scholars with almost 60 participants from the Initiative for Digital Cuneiform Studies (IDCS), which was founded for this purpose. It was organised by the project members Eva Huber, Tim Brandes and Timo Homburg and funded by the programme “Small Subjects - Visibly Innovative” of the German Rectors’ Conference. The proceedings of the workshop were submitted for publication in the journal of the Cuneiform Digital Library Initiative at the end of 2021.

A number of other networking activities followed, including a coordination meeting with the Cuneiform Digital Library Initiative (CDLI) to agree details on the provision of the digital edition data from the Haft Tappeh project. By the end of 2021, the image data had been consolidated to such an extent that their transfer to the repository of Heidelberg University Library could be prepared. This was based on the metadata schema designed by a cross-project team of the i3mainz to document the creation process of 3D objects.

Prompted by a CDLI workshop, closer cooperation on the formalisation of cuneiform palaeography has been under way since August 2021. The impetus came from the publication in October 2021 of PaleoCodage, an encoding system that had already been developed in a previous project phase. These formalisations are of interest not only to the CDLI but also to the W3C OntoLex-Lemon Working Group, especially its Multimodality subgroup. This group is concerned with formalising dictionaries for different languages and writing systems and aims to develop a data model for representing writing systems across language boundaries.

Using the example of around 30,000 cuneiform tablets from the Hilprecht Collection, the Haft Tappeh team tested how annotations can be realised on two-dimensional image media. The result is a web application under development, “CuneiformAnnotator”, which not only provides a basis for integrating these technologies into the “CuneiformWorkbench” but also enables the classification of various cuneiform characters. The tool was tested in a crowdsourcing process via the platform Zooniverse as part of courses in Ancient Oriental Studies and by external cooperation partners. At the NFDI4Culture Plenary in November 2021, the Haft Tappeh team presented the concept of annotations on 3D models in a short talk entitled “Rich and sustainable annotations on 3D objects”.
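As an illustration of what an annotation on a two-dimensional image can look like, the following sketch builds a record in the style of the W3C Web Annotation Data Model. The project description does not state which format CuneiformAnnotator uses, so the choice of this data model, the function name `make_sign_annotation`, the example image URL and the sign label are all assumptions made for the example.

```python
import json


def make_sign_annotation(image_url, x, y, width, height, sign_label):
    """Return a Web Annotation-style dict that tags a rectangular image region.

    Hypothetical helper: it is not taken from the CuneiformAnnotator code base.
    """
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "type": "Annotation",
        # The body carries the classification, here a sign label.
        "body": {"type": "TextualBody", "value": sign_label, "purpose": "tagging"},
        # The target points at a pixel rectangle via a media-fragment selector.
        "target": {
            "source": image_url,
            "selector": {
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": f"xywh={x},{y},{width},{height}",
            },
        },
    }


# Placeholder image URL and sign label, chosen only for illustration.
annotation = make_sign_annotation(
    "https://example.org/tablet-front.jpg", 120, 80, 40, 35, "AN"
)
print(json.dumps(annotation, indent=2))
```

A record of this kind keeps the region and its label together in one self-describing JSON object, which is one plausible way to exchange sign annotations between tools.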

A practical project within the framework of the inter-university Master’s programme Digital Methodology in the Humanities and Cultural Studies was dedicated to the back-projection of existing 2D annotations into 3D. The results of the practical project are currently under review in the CDLI Journal and the Journal of Open Archaeology Data (JOAD). At the Linked Pasts VII Symposium in December 2021, the ontology model of the Haft Tappeh project was presented in a poster contribution.

A second practical project was dedicated to the development of similarity metrics on 3D models of cuneiform tablets. The aim was to create a digital fingerprint of the 3D scans, which can be compared with objects in larger data repositories. The similarity to so-called reference bodies such as standard spheres or cuboids was measured and related to the 3D scans. Combined with features such as find locations or text content, the scans can then be drawn on for further analysis.
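The idea of measuring similarity to reference bodies can be sketched with two simple shape descriptors. The project description does not name the metrics that were actually used, so the following is only an illustrative assumption: sphericity (1.0 for a perfect sphere) and the fill ratio of the axis-aligned bounding box (1.0 for a perfect cuboid). The names `ShapeFingerprint`, `fingerprint` and `distance` are hypothetical, and the volume, surface area and bounding-box dimensions are assumed to have been computed from the mesh beforehand.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ShapeFingerprint:
    sphericity: float  # 1.0 means the shape behaves like a sphere
    box_fill: float    # 1.0 means the shape fills its bounding box like a cuboid


def fingerprint(volume, surface_area, bbox):
    """Build a two-value fingerprint from precomputed mesh measurements."""
    dx, dy, dz = bbox
    # Sphericity: surface area of a sphere of equal volume divided by the
    # actual surface area of the object.
    sphericity = (math.pi ** (1 / 3)) * (6 * volume) ** (2 / 3) / surface_area
    # Fraction of the bounding box that the object occupies.
    box_fill = volume / (dx * dy * dz)
    return ShapeFingerprint(sphericity, box_fill)


def distance(a, b):
    """Euclidean distance between two fingerprints (smaller = more similar)."""
    return math.hypot(a.sphericity - b.sphericity, a.box_fill - b.box_fill)


# Reference bodies: a unit sphere and a 1 x 2 x 3 cuboid.
sphere = fingerprint(4 / 3 * math.pi, 4 * math.pi, (2, 2, 2))
cuboid = fingerprint(6.0, 22.0, (1, 2, 3))
print(sphere, cuboid, distance(sphere, cuboid))
```

A scan whose fingerprint lies closer to the cuboid than to the sphere would, under these assumptions, be classed as more slab-like; a real pipeline would likely use richer descriptors.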

The 2022 project year was characterised by the expansion of international collaborations, for example with the Cuneiform Digital Library Initiative (CDLI). As part of the Google Summer of Code, students developed technologies for image annotation. The highlight was the workshop “Securing Data in Mesopotamia: New Technologies for Secured Cuneiform Texts” in Leiden in March. Kai-Christian Bruhn and Timo Homburg presented the results of the Haft Tappeh project and discussed the prospects for future cooperation with those present.

In 2023, together with the international DANES Network for Digital Cuneiform Research, founded in 2022, members of the Haft Tappeh project developed the data model for representing the digital paleography of the Haft Tappeh tablets in Wikidata.

In the summer of 2023, another scanning campaign took place in Iran to scan cuneiform tablets that had not yet been recorded in 3D and to document them within the project period.

The existing corpus was examined for personal names and other content and enriched with the published archaeological data. The personal names were published in the FactGrid database as linked open data. As part of the preparations for publishing the corpus of annotated cuneiform tablets, MaiCuBeDa, an annotation dataset from the Hilprecht Collection, was published in the summer.

Images

Photo of the cuneiform tablet HT 05-13-128 (i3mainz, CC BY SA 4.0)

Rendering of the front of the cuneiform tablet HT 05-13-128 (i3mainz, CC BY SA 4.0)

Transliteration of the cuneiform tablet HT 05-13-128 with marking of personal names (i3mainz, CC BY SA 4.0)

Location of Haft Tappeh (i3mainz, CC BY SA 4.0)

Duration

01.09.2019 – 31.08.2025

Funding

This project is funded by the Deutsche Forschungsgemeinschaft (DFG).

In collaboration with

Institut für Ägyptologie und Altorientalistik, JGU Mainz

Participants

Kai-Christian Bruhn

Timo Homburg (until 2025)

Lukas Ahlborn (until 2025)

External Participants

Prof. Dr. Doris Prechel, Institut für Altertumswissenschaften Altorientalische Philologie, Johannes Gutenberg-Universität Mainz

ao.altertumswissenschaften.uni-mainz.de/univ-prof-dr-doris-prechel/

Prof. Dr. Behzad Mofidi-Nasrabadi, Institut für Altertumswissenschaften Vorderasiatische Archäologie, Johannes Gutenberg-Universität Mainz

vorderasiatische-archaeologie.uni-mainz.de/pd-dr-behzad-mofidi-nasrabadi/

Tim Brandes, Institut für Altertumswissenschaften Altorientalische Philologie, Johannes Gutenberg-Universität Mainz

ao.altertumswissenschaften.uni-mainz.de/tim-brandes-m-a/

Eva Huber, Institut für Altertumswissenschaften Altorientalische Philologie, Johannes Gutenberg-Universität Mainz

ao.altertumswissenschaften.uni-mainz.de/eva-maria-huber/

Ali Zalaghi, Institut für Altertumswissenschaften Altorientalische Philologie, Johannes Gutenberg-Universität Mainz

ao.altertumswissenschaften.uni-mainz.de/personen/

Jun.-Prof. Dr. Hubert Mara

orcid.org/0000-0002-2004-4153

Publications

MaiCuBeDa HT
Ahlborn, Lukas; Homburg, Timo
heiDATA (ed.). Cuneiform Benchmark Datasets. Heidelberg. 2025

T.i.A.M.A.T: A Tool for Metadatation in Cultural Heritage Photogrammetry Acquisition
Lauro, Vittorio; Weber, Ann-Kathrin; Homburg, Timo et al.
2025 IEEE International Conference on Cyber Humanities (IEEE-CH). Florence, Italy: IEEE 2025, pp. 1-8

The Mainz Cuneiform Benchmark Dataset Series: Sign Annotations of 3D Rendered Tablets
Homburg, Timo; Ahlborn, Lukas; Bruhn, Kai-Christian et al.
Journal of Open Archaeology Data. Vol. 172. London: Ubiquity Press 2025

Insights into digital ancient Near Eastern studies: introduction
Gordin, Shai; Béranger, Marine; Homburg, Timo et al.
it - Information Technology. Vol. 66. No. 1. Walter de Gruyter 2024, pp. 1-3

Insights into Digital Ancient Near Eastern Studies (iDANES), Part 1
Gordin, Shai; Béranger, Marine; Homburg, Timo et al.
Berlin: De Gruyter 2024

Insights into Digital Ancient Near Eastern Studies (iDANES), Part 2
Gordin, Shai; Béranger, Marine; Homburg, Timo et al.
Berlin: De Gruyter 2024

PaleOrdia: Semantically Describing (Cuneiform) Paleography using Paleographic Linked Open Data.
Homburg, Timo
SemDH@ESWC. 2024

Preparing multi-layered visualisations of Old Babylonian cuneiform tablets for a machine learning OCR training model towards automated sign recognition
Hameeuw, Hendrik; De Graef, Katrien; Ryberg Smidt, Gustav et al.
Information Technology (it). Vol. itit-2023-0063. 2024, pp. 1-15

Towards the Integration of Cuneiform in the OntoLex-Lemon Framework
Homburg, Timo; Declerck, Thierry
Haralambous, Yannis (ed.). Grapholinguistics in the 21st Century, Part 1. 2024, pp. 265-297

CNN Based Cuneiform Sign Detection Learned from Annotated 3D Renderings and Mapped Photographs with Illumination Augmentation
Stötzner, Ernst; Homburg, Timo; Mara, Hubert
Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops. Paris. 2023, pp. 1680-1688

From an Analog to a Digital Workflow: An Introductory Approach to Digital Editions in Assyriology
Homburg, Timo; Brandes, Tim; Huber, Eva-Maria et al.
CDLI Bulletin. Vol. 2023. No. 4. 2023, pp. 1-30

MaiCuBeDa Hilprecht: Mainz Cuneiform Benchmark Dataset for the Cuneiform Texts from the Hilprecht Collection of Babylonian Antiquities
Homburg, Timo; Mara, Hubert
Universitätsbibliothek Heidelberg. 2023

R-CNN based Polygonal Wedge Detection Learned from Annotated 3D Renderings and Mapped Photographs of Open Data Cuneiform Tablets
Stötzner, Ernst; Homburg, Timo; Bullenkamp, Jan Ph. et al.
Eurographics Workshop on Graphics and Cultural Heritage. Salento. 2023, pp. 1-15

3D Data Derivatives of the Haft Tappeh Processing Pipeline
Homburg, Timo; Zwick, Robert; Bruhn, Kai-Christian et al.
Cuneiform Digital Library Journal. Vol. 9. No. 1. Cuneiform Digital Library Initiative 2022, CDLJ 2022:1

Annotated 3D-Models of Cuneiform Tablets
Homburg, Timo; Zwick, Robert; Mara, Hubert et al.
Journal of Open Archaeology Data. Vol. 10. Ubiquity Press, Ltd. 2022, 4

PaleoCodage—Enhancing machine-readable cuneiform descriptions using a machine-readable paleographic encoding
Homburg, Timo
Digital Scholarship in the Humanities. Vol. 36. No. Supplement_2. Oxford University Press (OUP) 2021, pp. 127-154