Video Presentations

Welcome and Keynote Speaker session

Session 1. Evaluation and improvement of OCR

Session 2. Applications

Session 3. OCR and HTR in practice

Session 4. Digitisation of historical languages

Session 5. Access to data

Session 6. Natural language processing

Session 7. Metadata

Welcome and Keynote Speaker session

Frieda Steurs, INT

Organisers' welcome

Reinhard Altenhöner, SBB

The next exercise for libraries: data enrichment and analysis as a key technology for new tasks and offerings

Session 1. Evaluation and improvement of OCR

Matthias Boenig, Konstantin Baierer, Volker Hartmann, Maria Federbusch and Clemens Neudecker

Labelling OCR Ground Truth for Usage in Repositories

Anna-Maria Sichani, Panagiotis Kaddas, George K. Mikros and Basilis Gatos

OCR for Greek polytonic (multi accent) historical printed documents: development, optimization and quality control

Tobias Englmeier, Florian Fink and Klaus Schulz

A-I-PoCoTo - Combining Automated and Interactive Postcorrection of OCR results

Session 2. Applications

Emad Mohamed and Zeeshan Ali Sayyed

Arabic-SOS: Segmentation, Stemming, and Orthography Standardization for Classical and pre-Modern Standard Arabic

Christian Reul, Sebastian Göttel, Uwe Springmann, Christoph Wick, Kay-Michael Würzner and Frank Puppe

Automatic Semantic Text Tagging on Historical Lexica by Combining OCR and Typography Classification

Juri Opitz, Leo Born, Vivi Nastase and Yannick Pultar

Automatic Reconstruction of Emperor Itineraries from the Regesta Imperii

Karin Hofmeester, Ashkan Ashkpour, Katrien Depuydt and Jesse de Does

Diamonds in Borneo: Commodities as Concepts in Context

Session 3. OCR and HTR in practice

Clemens Neudecker, Konstantin Baierer, Maria Federbusch, Kay-Michael Würzner, Matthias Boenig, Elisa Hermann and Volker Hartmann

OCR-D: An end-to-end open-source OCR framework for historical documents

Kimmo Kettunen, Teemu Ruokolainen, Erno Liukkonen, Pierrick Tranouez, Daniel Antelme and Thierry Paquet

Detecting Articles in a Digitized Finnish Historical Newspaper Collection 1771–1929: Early Results Using the PIVAJ Software

Christian Clausner, Apostolos Antonacopoulos, Christy Henshaw and Justin Hayes

Towards the Extraction of Statistical Information from Digitised Numerical Tables - The Medical Officer of Health Reports Scoping Study

Arnau Baró, Jialuo Chen, Alicia Fornés and Beáta Megyesi

Towards a generic unsupervised method for transcription of encoded manuscripts

Session 4. Digitisation of historical languages

Thomas Milo and Alicia González Martínez

A New Strategy for Arabic OCR: Archigraphemes, Letter Blocks, Script Grammar, and Shape Synthesis

Senka Drobac, Pekka Kauppinen and Krister Lindén

Improving OCR of historical newspapers and journals published in Finland

Session 5. Access to data

Anne Gorter, Edwin Klijn, Rutger Van Koert, Marielle Scherer and Ismee Tames

Tribunal Archives as Digital Research Facility (TRIADO): new ways to make archives accessible and useable

Tom Derrick and Nora McGregor

Cross-disciplinary collaborations to enrich access to non-Western language material in the Cultural Heritage sector

Georg Rehm, Martin Lee, Julián Moreno Schneider and Peter Bourgonje

Curation Technologies for a Cultural Heritage Archive: Analysing and transforming a heterogeneous data set into an interactive curation workbench

Evagelos Varthis, Marios Poulos, Ilias Yarenis and Sozon Papavlasopoulos

Implementation of a Databaseless Web REST API for the Unstructured Texts of Migne's Patrologia Graeca with Searching capabilities and additional Semantic and Syntactic expandability

Session 6. Natural language processing

Jeremi Ochab and Holger Essler

Stylometry of literary papyri

Sandra Young

Using lexicography to characterise relations between species mentions in the biodiversity literature

Session 7. Metadata

Péter Király

Validating 126 million MARC records

Katrien Depuydt and Hennie Brugman

Turning Digitised Material into a Diachronic Corpus: Metadata Challenges in the Nederlab Project
