Teaching assistant for Large Language Models
Graduate course, ETH Zurich, Department of Computer Science, 2025
- Prepared and presented tutorials for more than 300 students in the Large Language Models course. The course covers a wide range of LLM-related topics, from theoretical analysis of architectures such as RNNs and transformers, to evolving methods such as instruction fine-tuning and RLHF, to weak points of LLMs such as prompt injection and model hijacking.
Course Overview
Large language models have become one of the most widely deployed NLP technologies. In the past half-decade, their integration into core natural language processing tools has dramatically improved those tools' performance, and they have entered the public discourse surrounding artificial intelligence. In this course, we start with the probabilistic foundations of language models, i.e., what constitutes a language model from a formal, theoretical perspective. We then discuss how to construct and curate training corpora, and introduce many of the neural-network architectures commonly used to instantiate language models at scale. The course also covers privacy and harms, as well as applications of language models in NLP and beyond.
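As a concrete anchor for the probabilistic-foundations unit mentioned above, here is a minimal sketch of the standard formal definition in my own notation; the symbols V, y, and EOS are assumptions for illustration, not taken from the course materials:

```latex
% Minimal sketch (own notation): a language model is a probability
% distribution p over finite strings y = y_1 ... y_T drawn from a
% vocabulary V, usually factorized autoregressively, with a
% distinguished EOS symbol marking the end of each string.
\[
  p(\boldsymbol{y})
  \;=\;
  p(\textsc{eos} \mid \boldsymbol{y})
  \prod_{t=1}^{T} p\left(y_t \mid y_{<t}\right),
  \qquad
  \boldsymbol{y} = y_1 \cdots y_T \in V^{*}.
\]
% For p to be a valid language model, the probabilities p(y) must
% sum to 1 over all finite strings in V*; at t = 1 the context
% y_{<1} is the empty string.
```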