We sketch here the planned contents for 2024–2025.
These contents are structured around three important subdomains of
linguistics (morphology, syntax, and semantics),
presenting for each of them some of the related models and the
corresponding algorithmic issues. The exact dates and content might
change.
September 23rd, 2024 (slides)
General Introduction: Language has structure. Language and inference. The importance of ambiguity. Language and the world.
Linguistics basics for computational linguistics. Statistical properties of words, constituent and dependency analyses, computing semantic denotations and computing semantic similarities.
Machine learning basics for computational linguistics. Coding discrete symbols as vectors (word embeddings), optimisation reminder.
September 30th, 2024 (slides)
Modelling sequences: presentation of typical problems involving sequence modelling.
Generative models: language models, hidden Markov models, PCFG
Discriminative models: conditional random fields
Algorithms: Viterbi and approximate methods
Deep-learning-based methods
October 14th, 2024
Modelling syntax (slides)
Phrase structure grammar
Tree-adjoining grammar
Dependency syntax
Categorial grammar
October 21st, 2024
Parsing algorithms for natural language (slides)
CKY and Earley: introduction to weighted CKY and Earley
Shift-reduce and Eisner for dependency syntax
CKY for tree-adjoining grammar
October 28th, 2024
November 4th, 2024
November 18th, 2024
November 25th, 2024
Discourse analysis: discourse representation theory, anaphora resolution, type-theoretic dynamic logic
December 2nd, 2024
Project for the first half of the class: Link here