Continual-T0: progressively instructing 50+ tasks to language models without forgetting

This was a paper I presented in Bang Liu’s research group meeting on 2022-06-06. You can view the slides I used here. Continual-T0 (CT0) extends T0 by progressively training it on 8 unseen language generation tasks, while retaining a replay buffer of 1% of the original training data to preserve performance. The result is a model that maintains nearly all of its performance on previous tasks while learning the new tasks.
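
To make the replay-buffer idea concrete, here is a minimal sketch of how a small sample of the original training data could be mixed into the data for a new task. It is an illustration under my own assumptions, not the paper's code: the function name `build_rehearsal_mix`, the uniform random sampling, and the toy string examples are all hypothetical, and only the 1% fraction comes from the summary above.

```python
import random


def build_rehearsal_mix(new_task_examples, original_examples,
                        replay_fraction=0.01, seed=0):
    """Mix new-task examples with a small random sample of the original data.

    Assumption: the replay sample is drawn uniformly at random and its size
    is a fixed fraction of the original training set.
    """
    rng = random.Random(seed)
    # Number of original examples to replay (at least one).
    n_replay = max(1, round(len(original_examples) * replay_fraction))
    replay = rng.sample(original_examples, n_replay)
    # Combine and shuffle so replayed examples are interleaved with new ones.
    mixed = list(new_task_examples) + replay
    rng.shuffle(mixed)
    return mixed


# Toy data standing in for real training examples.
original = [f"orig-{i}" for i in range(1000)]
new_task = [f"new-{i}" for i in range(50)]
mixed = build_rehearsal_mix(new_task, original)
```

With 1,000 original examples and a 1% fraction, 10 of them are replayed alongside the 50 new-task examples.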

Multitask prompted training enables zero-shot task generalization (T0)

T0 builds on T5 by fine-tuning on more natural prompts and testing the model’s generalization to held-out tasks. Compare the training format diagrams for T5 (top) and T0 (bottom). Intuitively, the T0 prompts are more likely to resemble the implicit or explicit prompting present in the pretraining data. The authors created several prompts for each dataset. Results: Our experiments study two questions. First, does multitask prompted training improve generalization to held-out tasks?

WordBurner beta

Update 2022-04-27: The beta is over, but the apk is still installable with the instructions below and any feedback sent from inside the app will be received by me. I’m going to be working on this more over the summer, and eventually publishing it on the app store. :) Ever since learning Spanish, it has been a dream of mine to create a vocabulary study app that meets my needs. Duolingo won’t cover advanced vocabulary, Anki requires manually-generated decks, and other apps have expensive subscription plans.


This was a paper I presented in Bang Liu’s research group meeting on 2022-04-11. You can view the slides I used here.

It's not just size that matters: small language models are also few-shot learners

We presented this paper as a mini-lecture in Bang Liu’s IFT6289 course in winter 2022. You can view the slides we used here.