Schedule
DATeCH2017
Satellite workshops
Tuesday, May 30th
8:30: Registration
10:00 – 16:00 The journey from physical to digital and advancements in cultural heritage digitisation
9:00 – 18:00 TRACER tutorial for computational text reuse detection
13:00 – 17:00 TextGrid user workshop
18:15 – 19:30: GCDH Evening Lectures: Sara Tonelli (FBK-Trento): “NLP for Historical Content Analysis: Ongoing work and Open challenges” Link: http://www.gcdh.de/en/events/calendar-view/gcdh-evening-lectures-sara-tonelli-turin-nlp-historical-content-analysis-ongoing-work-and-open-challenges/
Wednesday, May 31st
8:30: Registration
9:00 – 16:30 Handwritten Text Recognition – Transkribus Workshop (project READ)
13:00 – 17:00 PoCoTo user workshop
9:00 – 17:00 IMPACT Members Meeting (only for IMPACT members)
Main conference
Thursday, June 1st
8:30: Registration
9:00 – 9:15: Conference Opening
9:15 – 10:45: Session 1. Transcription
(Chaired by Sinai Rusinek)
Jesper Zedlitz and Norbert Luttenberger. 750 Volunteers Transcribing 31,000 Pages with 8.5 million Entries Online – an Evaluation
Enrique Manjavacas and Peter Petre. Enabling Annotation of Historical Corpora in an Asynchronous Collaborative Environment
Manuel Burghardt and Sebastian Spanner. Allegro: User-centered Design of a Tool for the Crowdsourced Transcription of Handwritten Music Scores
Jesper Zedlitz and Norbert Luttenberger. Enhancing Human-Transcribed Records by Using OCR
10:45 – 11:15: Coffee break
11:15 – 13:05 Session 2. Natural Language Processing
(Chaired by Klaus Schulz)
Filip Graliński, Rafał Jaworski, Łukasz Borchmann and Piotr Wierzchoń. The RetroC challenge: how to guess the publication year of a text?
Cătălina Mărănduc, Cătălin Mititelu and Radu Simionescu. Parsing Romanian Specialized Dictionaries Structured in Nests
Markus Paluch, Gabriela Rotari, David Steding, Maximilian Weß, Maria Moritz and Marco Büchler. Analysis of part-of-speech tagging of historical German texts
Alessio Salomoni. Dependency Parsing on Late-18th-Century German Aesthetic Writings. A Preliminary Inquiry into Schiller and F. Schlegel.
Gustavo Candela, Maria Pilar Escobar Esteban and Borja Navarro-Colorado. In search of Poetic Rhythm: Poetry retrieval through text and metre
13:05 – 13:15: DARIAH presentations
Mike Mertens. Dariah-EU.
Stefan Schmunck. Dariah-DE.
13:15 – 14:00: Lunch break
14:00 – 15:30: Session 3. OCR and Postprocessing
(Chaired by Neil Fitzgerald)
Florian Fink, Klaus U. Schulz and Uwe Springmann. Profiling of OCR’ed Historical Texts Revisited
Alicia González Martínez, Tillmann Feige and Thomas Eich. Clear-cut methodology for Arabic OCR and post-correction with low technical skilled annotators
Harald Hammarström, Shafqat Virk and Markus Forsberg. Poor Man’s OCR Post-Correction: Unsupervised Recognition of Variant Spelling Applied to a Multilingual Document Collection
Manuel Ayuso. OCR of a mixed corpus: early printings and manuscripts of Martianus Capella’s work
15:30 – 16:00: Coffee break
16:00 – 17:30 Session 4. Natural Language Processing on Latin and Greek
(Chaired by Greta Franzini)
Marco Budassi and Marco Passarotti. The Impact of Unassimilated Loanwords on Latin Lexicon. A Qualitative and Quantitative Analysis
Corien Bary, Peter Berck and Iris Hendrickx. A Memory-Based Lemmatizer for Ancient Greek
Herbert Lange. Implementation of a Latin Grammar in Grammatical Framework
Eleonora Litta, Marco Passarotti and Paolo Ruffolo. Node Formation. Using Networks to Inspect Productivity in Affixal Derivation in Classical Latin
17:30 – 18:15 Poster session
B1: C. Clausner, C. Papadopoulos, S. Pletschacher, A. Antonacopoulos. The ENP Image and Ground Truth Dataset of Historical Newspapers
B2: C. Papadopoulos, S. Pletschacher, C. Clausner, A. Antonacopoulos. The IMPACT Dataset of Historical Document Images
B3: Cătălina Mărănduc, Augusto Perez and Victoria Bobicev. Building a Corpus to Study the Historical and Geographical Variation of Romanian Language
B4: So Miyagawa, Kirill Bulert and Marco Büchler. Utilization of Common OCR Tools for Typeset Coptic Texts
B5: Markus Paluch, Franz Mertins, Simone Rebora, Gabriela Rotari, Christina Schmidt, Benedict Spermoser, Ronald Weller, Maximilian Weß and J. Berenike Herrmann. https://kolimo.uni-goettingen.de – Building a Corpus of Modernist Literary Texts
B6: Emily Franzini, Greta Franzini, Gabriela Rotari, Franziska Pannach, Mahdi Solhdoust, Marco Büchler. The digital breadcrumb trail of Brothers Grimm
B7: Jim Salmons and Timlynn Babitsky. The MAGAZINE #GTS format, an integrated document structure and content depiction model supporting eResearch and machine-learning at the Internet Archive
B8: Karen Thöle. Digital means for the presentation and evaluation of a 15th century liturgical book
B9: Maria Moritz, Marco Büchler. Non-Literal Text Reuse in Historical Texts: An Approach to Identify Reuse Transformations and its Application to Bible Reuse
B10: C. Clausner, S. Pletschacher, A. Antonacopoulos. Efficient OCR Training Data Generation with Aletheia
B11: Christian Clausner. Overview on a number of document analysis tools, ranging from ground truth production to performance evaluation.
19:00 Dinner
Friday, June 2nd
8:30: Registration
9:00 – 10:30 Session 5. Infrastructure and Linked Open Data
(Chaired by Tomasz Parkola)
Péter Király. Towards an extensible measurement of metadata quality
Christophe Onambélé, Matyáš Kopp, Marco Passarotti and Jiří Mírovský. Converting Latin Treebank Data into SQL Database for Query Purposes
Thierry Declerck and Lisa Schäfer. Porting past classification schemes for narratives to a Linked Data Framework
Simone Rebora. A Software Pipeline for the Reception of Italian Literature in Nineteenth-Century England. Preliminary Testing
10:30 – 10:45 Best Paper Award Ceremony
10:45 – 11:15 Coffee break
11:15 – 12:45 Session 6. Digitisation & Layout Analysis
(Chaired by Apostolos Antonacopoulos)
Christian Reul, Uwe Springmann and Frank Puppe. LAREX – A semi-automatic open-source Tool for Layout Analysis and Region Extraction on Early Printed Books
Svetlana Cojocaru, Ludmila Malahov and Alexandru Colesnicov. Digitization of Old Romanian Texts Printed in the Cyrillic Script
Christian Clausner, Justin Hayes, Apostolos Antonacopoulos and Stefan Pletschacher. Unearthing the Recent Past: Digitising and Understanding Statistical Information from Census Tables
Christian Reul, Marco Dittrich and Martin Gruner. Case Study of a highly automated Layout Analysis and OCR of an incunabulum: ‘Der Heiligen Leben’ (1488)
12:45 – 13:30 Lunch break
13:30 – 15:00 Session 7. Spatial Analysis
(Chaired by Marco Büchler)
Kimmo Kettunen and Teemu Ruokolainen. Names, Right or Wrong: Named Entities in an OCRed Historical Finnish Newspaper Collection
Rebecca Benefiel, Sara Sprenkle, Holly Sypniewski and Jamie White. Ancient Graffiti Project: Geo-Spatial Visualization and Search Tools for Ancient Handwritten Inscriptions
Gustavo Candela, Maria Pilar Escobar Esteban and Manuel Marco-Such. Semantic Enrichment on Cultural Heritage collections: A case study using geographic information
Mariona Coll Ardanuy and Caroline Sporleder. Weakly-supervised toponym disambiguation in historical documents using semantic and geographic features
15:00 – 15:30: Coffee break
15:30 – 16:30 Final Panel