Accepted papers

DATeCH2019


DATeCH2019 proceedings are available at: https://dl.acm.org/doi/proceedings/10.1145/3322905

Oral presentations

  • Anna-Maria Sichani, Panagiotis Kaddas, George K. Mikros and Basilis Gatos. OCR for Greek polytonic (multi accent) historical printed documents: development, optimization and quality control

  • Anne Gorter, Edwin Klijn, Rutger Van Koert, Marielle Scherer and Ismee Tames. Tribunal Archives as Digital Research Facility (TRIADO): new ways to make archives accessible and useable

  • Arnau Baró, Jialuo Chen, Alicia Fornés and Beáta Megyesi. Towards a generic unsupervised method for transcription of encoded manuscripts

  • Bruno Bon and Laura Vangone. Challenges of Mass OCR-isation of Medieval Latin Texts in a Resource-Limited Project

  • Christian Clausner, Apostolos Antonacopoulos, Christy Henshaw and Justin Hayes. Towards the Extraction of Statistical Information from Digitised Numerical Tables - The Medical Officer of Health Reports Scoping Study

  • Christian Reul, Sebastian Göttel, Uwe Springmann, Christoph Wick, Kay-Michael Würzner and Frank Puppe. Automatic Semantic Text Tagging on Historical Lexica by Combining OCR and Typography Classification

  • Clemens Neudecker, Konstantin Baierer, Maria Federbusch, Kay-Michael Würzner, Matthias Boenig, Elisa Hermann and Volker Hartmann. OCR-D: An end-to-end open-source OCR framework for historical documents

  • Eliese-Sophia Lincke, Marco Büchler and Kirill Bulert. Optical Character Recognition for Coptic. A multi-source approach for scholarly editions

  • Emad Mohamed and Zeeshan Ali Sayyed. Arabic-SOS: Segmentation, Stemming, and Orthography Standardization for Classical and pre-Modern Standard Arabic

  • Evagelos Varthis, Marios Poulos, Ilias Yarenis and Sozon Papavlasopoulos. Implementation of a Databaseless Web REST API for the Unstructured Texts of Migne's Patrologia Graeca with Searching capabilities and additional Semantic and Syntactic expandability

  • Georg Rehm, Martin Lee, Julián Moreno Schneider and Peter Bourgonje. Curation Technologies for a Cultural Heritage Archive: Analysing and transforming a heterogeneous data set into an interactive curation workbench

  • Giuseppe Celano. Standoff Annotation for the Ancient Greek and Latin Dependency Treebank

  • Helmut Schmid. Deep Learning-Based Morphological Taggers and Lemmatizers for Annotating Historical Texts

  • Hsiang-An Wang and Pin-Ting Liu. Towards a Higher Accuracy of Optical Character Recognition of Chinese Rare Books in Making Use of Text Model

  • Jeremi Ochab and Holger Essler. Stylometry of literary papyri

  • Juri Opitz, Leo Born, Vivi Nastase and Yannick Pultar. Automatic Reconstruction of Emperor Itineraries from the Regesta Imperii

  • Karin Hofmeester, Ashkan Ashkpour, Katrien Depuydt and Jesse de Does. Diamonds in Borneo: Commodities as Concepts in Context

  • Katrien Depuydt and Hennie Brugman. Turning Digitised Material into a Diachronic Corpus: Metadata Challenges in the Nederlab Project

  • Kimmo Kettunen, Teemu Ruokolainen, Erno Liukkonen, Pierrick Tranouez, Daniel Antelme and Thierry Paquet. Detecting Articles in a Digitized Finnish Historical Newspaper Collection 1771–1929: Early Results Using the PIVAJ Software

  • Liviu Pop. Hidden Metadata in Plain Sights: Romanian Folklore Catalogues

  • Matthias Boenig, Konstantin Baierer, Volker Hartmann, Maria Federbusch and Clemens Neudecker. Labelling OCR Ground Truth for Usage in Repositories

  • Péter Király. Validating 126 million MARC records

  • Sandra Young. Using lexicography to characterise relations between species mentions in the biodiversity literature

  • Senka Drobac, Pekka Kauppinen and Krister Lindén. Improving OCR of historical newspapers and journals published in Finland

  • Thomas Milo and Alicia González Martínez. A New Strategy for Arabic OCR: Archigraphemes, Letter Blocks, Script Grammar, and shape synthesis

  • Tobias Englmeier, Florian Fink and Klaus Schulz. A-I-PoCoTo - Combining Automated and Interactive Postcorrection of OCR results

  • Tom Derrick and Nora McGregor. Cross-disciplinary collaborations to enrich access to non-Western language material in the Cultural Heritage sector

Posters

  • Ben Companjen, Peter Verhaar, Koenraad Donker van Heel, Ferdinand Harmsen and Juan José Archidona Ramírez. Piloting the Abnormal Hieratic Global Portal

  • Bijayananda Pradhan and Kotrayya Agadi. Big Data Application in Academic Libraries: status study

  • Błażej Betański, Mateusz Matela, Maciej Mikuła and Tomasz Parkoła. Text collation in the dataset of the sources of the old law

  • Catalina Maranduc, Ludmila Malahov and Mihaela Marin. Alignment of the Romanian Oldest New Testament

  • Catalina Maranduc, Victoria Bobicev and Roman Untilov. Syntactic Parser for Old and Regional Romanian

  • Francesco Gelati. Selective Harvester: Harvesting and Managing Archival Descriptions as XML-EAD files

  • Hadewijch Masure. Itinera Nova: an ambitious digitization and disclosure of the Leuven Bench of Aldermen archives

  • J. Nathanael Philipp and Maximilian Bryan. Evaluation of CNN architectures for text detection in historical maps

  • Jim Salmons and Timlynn Babitsky. #MAGAZINEgts and #dhSegment: Using a Metamodel Subgraph to Generate Synthetic Data of Under-Sampled Complex Document Structures for Machine-Learning

  • Kimmo Kettunen, Mika Koistinen and Jukka Kervinen. Tidying up the Mess – on a Way to Improved Quality in a Historical Finnish Newspaper and Journal Collection 1771-1910

  • Marieke Meelen and Christopher Handy. Intelligent Agents and Genetic Algorithms for Tibetan and Chinese Tagging and Alignment

  • Mª Isabel Rodríguez Fidalgo, Adriana Paíno Ambrosio and Diego A Burgos. «Omnium scientiarum princeps Salmantica docet»: An immersive 360º experience

  • Roxanne Wyns and An Smets. International Image Interoperability Framework @ KU Leuven (Belgium). Current applications and future projects

  • Ryma Benabdelaziz, Djamel Gaceb and Mohammed Haddad. Word Spotting in Historical Handwritten document Images based on Texture features in Spatial Context

  • Sinai Rusinek and Nurit Greidinger. No Tabula Rasa: Digitizing Historical Newspapers here and now

  • Shu Jiun Chen. Semantic Enrichment of Linked Biography Data for Digital Humanities

  • Soumya Mohanty and David Smith. Alignment-Based Training for Detecting Reader Annotations in Printed Books

  • Vanessa Hannesschläger. «Retro-editing»: The edition of an edition of the Karl Kraus legal papers

  • Wouter Termont, Lorenz Demey and Hans Smessaert. First Steps Toward a Digital Database of Aristotelian Diagrams

  • Zdeněk Uhlíř, Olga Čiperová, Tomáš Klimek and Tomáš Psohlavec. The Fragmentation of the Contents of Historical Text Editions in the Manuscriptorium Digital Library Environment