ILLC Seminar | Andrey Kutuzov: What can go wrong with pre-trained language models for semantic change detection

November 8 @ 4:00 pm - 5:30 pm

Organized by the Institute for Logic, Language and Computation (ILLC)

  • Speaker: Andrey Kutuzov
  • Location: Amsterdam Science Park + live streaming via Zoom
  • Room: L3.36 at LAB42
  • Title: ‘What can go wrong with pre-trained language models for semantic change detection’ 
  • Abstract:

Large-scale contextualized language models, both LSTMs and Transformers, are now often used for semantic change detection. The results are impressive, but are LM-based systems always correct? In this talk, I will qualitatively analyze questionable outputs of such systems, using as an example the degrees of semantic change predicted for English words across five decades.

It seems that LM-based methods often predict high change scores for words that are not undergoing any real diachronic semantic shift in the lexicographic sense of the term. Pre-trained language models are prone to confounding changes in lexicographic senses with changes in contextual variance. Notably, this differs from the types of issues observed in methods based on static word2vec-like embeddings. Additionally, contextualized language models often conflate syntactic and semantic aspects of lexical entities. I will discuss such cases in detail, complete with examples, an attempt at their linguistic categorization, and a range of possible future solutions.


Amsterdam Science Park
Science Park 904
Amsterdam, Noord-Holland 1098 XH, Netherlands
