Cuneiform annotator and cuneiform character classification

Cuneiform annotator and cuneiform character classification (Shirley Sidharta, CC BY-SA 4.0)

The performance of machine learning applications depends largely on the quality of the training data they process. In character recognition on cuneiform tablets, classification depends on training data that is usually created by specialised scientists. The project aims to provide tools for creating and transforming annotations on 2D and 3D media, to develop annotation standards for annotation types that existing standards do not yet cover, to create a training data set for cuneiform writing, and then to use that data set for classification.


The Cuneur: Cuneiform Annotator tool was originally developed as a prototype in 2021 on the basis of the annotation tool Annotorious and was at first used only internally in the Haft Tappeh project. It has since been developed further in various application contexts. In cooperation with researchers from the Belgian Cune-IIIF-orm project, the tool was tested and adapted to the needs of cuneiform research. The resulting findings have been published in the itit Journal.
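Annotorious stores annotations in the W3C Web Annotation format, so a single annotated sign can be pictured as a small JSON document. The following is a minimal sketch of such a record, not the actual Cuneur output: the function name, the image URL, the annotation id and the sign name "AN" are hypothetical placeholders, and the real tool may attach further bodies or use other selector types.

```python
import json


def make_sign_annotation(image_url, x, y, w, h, sign_name, annotation_id):
    """Build a W3C Web Annotation for one rectangular sign region.

    Hypothetical helper for illustration; not part of Cuneur or Annotorious.
    """
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": annotation_id,
        "type": "Annotation",
        # The body carries what is said about the region, here a sign name.
        "body": [
            {"type": "TextualBody", "purpose": "tagging", "value": sign_name}
        ],
        # The target says which part of which image is meant.
        "target": {
            "source": image_url,
            "selector": {
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": f"xywh=pixel:{x},{y},{w},{h}",
            },
        },
    }


if __name__ == "__main__":
    # All values below are made-up example data.
    anno = make_sign_annotation(
        "https://example.org/tablet_front.png", 270, 120, 90, 170, "AN", "#sign-0001"
    )
    print(json.dumps(anno, indent=2))
```

Keeping the region in the target and the label in the body means the same record can later serve as a training sample: the selector yields the image crop and the body yields the class.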

A cuneiform image corpus (Hilprechtsammlung, HeiCuBeDa) that had been annotated with Cuneur in 2022 was published in 2023 as a separate machine learning dataset under the name MaiCuBeDa. Based on this dataset, two different classification approaches were developed in 2023, both of which were presented at conferences. One of them received the Best Paper Award at the GCH2023 conference.

A tool developed in this context transfers annotations created on 2D renderings to photos of the same side of the cuneiform tablet. Over the course of the year, work began on standardising 3D annotations and on implementing prototypes for them; both will be discussed in the new year in the context of NFDI4Objects.
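How the transfer from rendering to photo is implemented is not described here. One simple way to picture the idea is sketched below, under strong assumptions: the tablet outline is known as an axis-aligned bounding box in both images, and the two views differ only by scale and offset. The names `Box` and `transfer_box` are hypothetical, and a real tool would likely need a more general registration (for example one that handles rotation or perspective).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in pixel coordinates (top-left x, y, width, height)."""
    x: float
    y: float
    w: float
    h: float


def transfer_box(box, src_outline, dst_outline):
    """Map an annotation box from the source image to the destination image.

    Assumes (for illustration only) that both outlines enclose the same
    tablet side and that the images differ by scale and translation.
    """
    sx = dst_outline.w / src_outline.w  # horizontal scale factor
    sy = dst_outline.h / src_outline.h  # vertical scale factor
    return Box(
        x=dst_outline.x + (box.x - src_outline.x) * sx,
        y=dst_outline.y + (box.y - src_outline.y) * sy,
        w=box.w * sx,
        h=box.h * sy,
    )


if __name__ == "__main__":
    # Made-up example: tablet outline in the rendering and in the photo.
    rendering_outline = Box(0, 0, 100, 200)
    photo_outline = Box(10, 20, 200, 400)
    sign_on_rendering = Box(10, 10, 20, 20)
    print(transfer_box(sign_on_rendering, rendering_outline, photo_outline))
```

Expressing the sign position relative to the tablet outline, rather than to the image frame, is what makes the annotation independent of how each image was cropped or scaled.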