Solidarity with Ukraine – 4EU+ for Ukraine II / Corpus Linguistics Workshop, 22-24.07.2024



At the end of July 2024, the Institute of English Studies of the University of Warsaw hosted the Corpus Linguistics Workshop. The event was one of the activities organised within the framework of the project 4EU+ for Ukraine II, financed by the National Agency for Academic Exchange of the Republic of Poland. Academics from Ivan Franko National University of Lviv, V. N. Karazin National University of Kharkiv and Taras Shevchenko National University of Kyiv took part in a three-day programme prepared by Dr Marcin Opacki and Dr Maciej Rosiński from the Institute of English Studies, UW.


22.07; Day 1: Foundations of Corpus Linguistics

10:00-11:30 Introduction to Corpus Linguistics – Definitions and key concepts

  • Types of corpora (e.g. written, spoken, specialized)
  • Real-world applications (e.g. educational research, sentiment analysis, authorship attribution, language models)

11:45-13:15 Corpus compilation

  • Representativeness and sampling
  • Data preparation

13:15-14:15 Lunch break

14:15-15:45 Corpus Query Tools and Techniques: software showcase and practical hands-on session

  • Basic querying, concordancing, regular expressions

19:00 Dinner at Informal Kitchen (Plac Małachowskiego 2)


23.07; Day 2: Advanced Topics and Practical Applications

10:00-11:30 Annotation and part-of-speech tagging; Methodological considerations

  • Formal grammatical assumptions as the foundation of parsing
  • Working with annotated vs. unannotated data

11:45-13:15 Statistical Analysis in Corpus Linguistics

  • Frequency distributions
  • Stylometrics, multi-word expressions, and collocation measures
  • Basic statistical analyses
  • Practical exercises with real-world data

13:15-14:15 Lunch break

14:15-15:45 Software showcase and practical hands-on session


24.07; Day 3: Case Studies in Semantics and Discourse Analysis

10:00-11:30 Selected concepts in semantics

  • Analyzing synonymy, antonymy, metonymy and metaphor with a corpus linguistics toolkit

11:45-13:15 Establishing protocols for identification and categorization of semantic phenomena

13:15-14:15 Lunch break

14:15-15:45 Corpus linguistics and discourse analysis

  • Building thematic corpora
  • Keyword analysis
  • Framing


Didactic materials:



Testimonials from one of the participants, Ms Natalya Oliynyk, are available on LinkedIn and Facebook.

