

Polish Language(s) and Digital Humanities Using R

Block Seminar - Four Fridays in February 2020 with Garik Moroz (Moscow) in Naumburg

The Aleksander Brückner Centre for Polish Studies Jena invites researchers and students to a multi-part seminar in Naumburg with DAAD fellow Garik Moroz (Linguistic convergence lab, Higher School of Economics, Moscow).

In this course we will cover the basics of R, data visualization and data manipulation, and discuss some Natural Language Processing topics such as working with strings and texts, as well as research procedures (e.g., stylometry). We will focus on applying these tools to Polish, especially to dialects of Polish. Projects concerning other Slavic languages are also welcome.

Prior experience with R or programming is welcome but not required. Each class combines a lecture with hands-on application of the tools and methods to problem solving. Participants should bring their own PC (if possible with R and RStudio already installed).

The course is project-driven: participants will work on a specific data set and project, ideally based on a real research question. Rather than trying to cover all possible topics in a single course, the course is organized around consultations and additional lectures on particular topics as they become necessary for the projects.

The course will end with a hackathon, during which participants will use the skills and knowledge they have acquired to work on their projects and then present their results in a final presentation.

When? Fridays: 7/2 & 14/2 & 21/2 - always from 10am to 4pm (including lunch break) + 28/2 (Hackathon and short presentation)

Where? Domstift Naumburg, Seminar room Petrus / Seminar room Paulus, Südklausur of the Naumburger Dom, Domplatz 16/17, 06618 Naumburg

The course is open to all interested students and researchers in the Unibund region of Leipzig, Halle and Jena (and beyond). The workshop can be credited with 5 ECTS towards your studies.

If you are interested in taking part or have any questions, please write to Helena Link from the Aleksander-Brückner-Zentrum in Jena, in Polish, English or German, and include a short sketch of yourself, your study interests and possible project ideas. Places are assigned on a first-come, first-served basis, so early registration is recommended!


Tentative plan

First day (7 February):
(15 minutes) Meeting and quick discussion of your projects
(3 h.) Introduction to R and RStudio
(3 h.) Data manipulation: dplyr, tidyr

Second day (14 February):
(3 h.) Data visualisation: ggplot2
(3 h.) Working with strings: stringr

Third day (21 February):
(3 h.) Working with texts: gutenbergr, tidytext, udpipe
(3 h.) Authorship detection: stylo

Final day (28 February): HACKATHON (a day of intensive group work with the instructors' help)
9:30 -- 12:00 start of work
12:00 -- 13:00 lunch
13:00 -- 15:00 continuation of work
15:00 -- 16:00 preparation of presentations
16:00 -- 17:00 project presentations