Fri 05 Dec 12:00: Building and Understanding Human-scale Language Models
Nov. 29th, 2025 09:40 am
Abstract: Humans learn language from fewer than 100 million words. Today’s state-of-the-art language models are exposed to trillions of words. What do today’s human-scale language models learn, and what don’t they? How can we close this gap in data efficiency? In this talk, I will start by presenting insights from 3 years of the BabyLM Challenge. The purpose of BabyLM is to encourage researchers to train language models using only as much data as a human would need when first learning language, and to democratize access to language modeling research. Participants have submitted a wide variety of systems; the highest-performing systems tend to come from innovations to the architecture or training objective. Then, I will present recent work on the training dynamics of both human-scale and large-scale language models. I will present a method for understanding what concepts a model is learning at specific points in training. Using subject-verb agreement as a case study, I will show that simpler word-matching features are learned early in training, while more abstract grammatical number detectors (including cross-linguistic number features) are learned far later in training. I will conclude by discussing the future of BabyLM, and the future of interpretability as a tool for understanding and improving language model training.
Bio: Aaron Mueller is an Assistant Professor (Lecturer) of Computer Science (Informatics) and, by courtesy, of Data Science at Boston University. His research centers on developing language modeling methods and evaluations inspired by causal and linguistic principles, and applying these to precisely control and improve the generalization of computational models of language. He completed his Ph.D. at Johns Hopkins University. His work has been published in ML and NLP venues (such as ICML, ACL, and EMNLP) and has won awards at TMLR and ACL. He is a recurring organizer of the BlackboxNLP and BabyLM workshops.
- Speaker: Aaron Mueller (Boston University)
- Friday 05 December 2025, 12:00-13:00
- Venue: ONLINE ONLY. Google Meet Link: https://meet.google.com/yeu-pqce-rsn...
- Series: NLIP Seminar Series; organiser: Suchir Salhan.