These are my notes from research papers I read. Each page’s title is also a link to the abstract or PDF.

Artificial intelligence, values, and alignment

I presented this paper in Bang Liu’s research group meeting in two installments, on 2023-02-20 and 2023-02-27, and also in Irina Rish’s scaling and alignment course (IFT6760A) on 2023-03-07. You can view the slides I used here. (The thumbnail for this post was generated with Stable Diffusion! See the alt text for details.) Behind each vision for ethically-aligned AI sits a deeper question: how are we to decide which principles or objectives to encode in AI, and who has the right to make these decisions, given that we live in a pluralistic world full of competing conceptions of value?
Read more

Unsolved problems in ML safety

This was a paper we presented about in Irina Rish’s neural scaling laws course (IFT6760A) in winter 2023. You can view the slides we used here, and the recording here (or my backup here).

AlphaTensor: discovering faster matrix multiplication algorithms with reinforcement learning

This was a paper I presented about in Bang Liu’s research group meeting on 2022-10-21. You can view the slides I used here.

Whisper: robust speech recognition via large-scale weak supervision

This was a paper I presented about in Bang Liu’s research group meeting on 2022-09-30. You can view the slides I used here.

Selective annotation makes language models better few-shot learners

Selective annotation chooses a pool of samples to annotate from a large set of unlabeled data. The main result of the paper is that when this is combined with item-specific prompt retrieval, performance improves drastically (>10% relative gain, with lower variance). Interestingly, selective annotation does not help for finetuning, or when prompts are randomly selected. They call their selective annotation method “vote-\(k\)”. Vote-\(k\) essentially creates a network of similar unlabeled instances (with similarity measured via Sentence-BERT embeddings), and then selects from them using a network importance score that is discounted to promote diversity. (The discounting is performed by iteratively adding to the selection set, each time penalizing new nodes for being close to nodes that are already in the selection set.)
Read more
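The greedy, discounted graph-selection idea behind vote-\(k\) can be sketched in a few lines. This is a minimal illustration under my own assumptions, not the authors’ implementation: the function name `vote_k`, the cosine-similarity k-NN graph, and the exponential discount schedule are all stand-ins (the paper uses Sentence-BERT embeddings and its own scoring details).

```python
import numpy as np

def vote_k(embeddings, num_select, k=10, discount=0.9):
    """Greedy sketch of a vote-k-style selection.

    Builds a k-NN similarity graph over instance embeddings, then
    repeatedly picks the node whose neighbors contribute the highest
    total score, discounting votes from neighbors that are already
    close to the selected set (to promote diversity).
    """
    n = len(embeddings)
    # Cosine similarities between all pairs of instances.
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = normed @ normed.T
    np.fill_diagonal(sim, -np.inf)
    # Each node's k nearest neighbors in the graph.
    neighbors = [set(np.argsort(-sim[i])[:k]) for i in range(n)]

    selected = []
    # votes[j] counts how often j has appeared in a selected node's
    # neighborhood; more votes -> heavier discount on j's future votes.
    votes = np.zeros(n)
    for _ in range(num_select):
        scores = np.array([
            -np.inf if i in selected
            else sum(discount ** votes[j] for j in neighbors[i])
            for i in range(n)
        ])
        best = int(np.argmax(scores))
        selected.append(best)
        for j in neighbors[best]:
            votes[j] += 1
    return selected
```

With random embeddings in place of Sentence-BERT, `vote_k(emb, num_select=5)` returns five distinct, well-spread indices; in the paper these would be the instances sent for annotation.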