nlp

InstructEval: systematic evaluation of instruction selection methods

I presented this paper in Bang Liu’s research group meeting on 2023-09-25. You can view the slides I used here.

"Low-resource" text classification: a parameter-free classification method with compressors

I presented this paper in Bang Liu’s research group meeting on 2023-07-24. You can view the slides I used here. It seems like the authors made a mistake that inflated the scores for the multilingual experiments, according to Ken Schutte.

Whisper: robust speech recognition via large-scale weak supervision

I presented this paper in Bang Liu’s research group meeting on 2022-09-30. You can view the slides I used here.

Selective annotation makes language models better few-shot learners

Selective annotation chooses a pool of samples to annotate from a large set of unlabeled data. The main result of the paper is that combining this with item-specific prompt retrieval drastically improves performance (>10% relative gain, with lower performance variance). Interestingly, selective annotation does not help for finetuning, or when the prompts are randomly selected. They call their selective annotation method “vote-\(k\)”.

Vote-\(k\) essentially creates a network of similar (according to Sentence-BERT) unlabeled instances, and then selects from them with a network importance score that is discounted to promote diversity. The discounting is performed by iteratively adding to the selection set, each time penalizing new nodes for being close to nodes that are already in the selection set.
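
To make the graph-based selection a bit more concrete, here is a minimal sketch of a diversity-discounted selection loop in the spirit of vote-\(k\). It is not the authors' implementation: the function names, the nearest-neighbour count `k`, the discount base `rho`, and the random embeddings standing in for Sentence-BERT vectors are all assumptions made for illustration.

```python
import numpy as np


def knn_neighbors(embeddings, k):
    """Row i holds the indices of the k instances most cosine-similar to i."""
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    similarity = normed @ normed.T
    np.fill_diagonal(similarity, -np.inf)  # an instance is not its own neighbour
    return np.argsort(-similarity, axis=1)[:, :k]


def select_diverse(embeddings, budget, k=3, rho=10.0):
    """Greedily pick `budget` instances, discounting votes from covered regions.

    Hypothetical helper: every unselected instance "votes" for its k nearest
    neighbours, and a vote is worth rho ** (-c), where c counts how many of
    the voter's neighbours are already in the selection set.
    """
    n = len(embeddings)
    neighbors = knn_neighbors(embeddings, k)
    is_selected = np.zeros(n, dtype=bool)
    covered = np.zeros(n)  # covered[v] = selected instances among v's neighbours
    selected = []
    for _ in range(min(budget, n)):
        vote_weight = rho ** (-covered)
        scores = np.zeros(n)
        for voter in np.flatnonzero(~is_selected):
            scores[neighbors[voter]] += vote_weight[voter]
        scores[is_selected] = -np.inf  # never pick the same instance twice
        best = int(np.argmax(scores))
        selected.append(best)
        is_selected[best] = True
        # Voters that point at the new pick count for less from now on.
        covered[(neighbors == best).any(axis=1)] += 1
    return selected


# Usage: random vectors stand in for Sentence-BERT embeddings (an assumption).
rng = np.random.default_rng(0)
fake_embeddings = rng.normal(size=(30, 8))
print(select_diverse(fake_embeddings, budget=5))
```

The discount only affects instances whose neighbourhood already overlaps the selection set, so later picks are pushed toward parts of the graph that are not yet represented.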
Read more

Continual-T0: progressively instructing 50+ tasks to language models without forgetting

I presented this paper in Bang Liu’s research group meeting on 2022-06-06. You can view the slides I used here. Continual-T0 (CT0) extends T0 by progressively training it on 8 unseen language generation tasks, while retaining a replay buffer of 1% of the original training data to preserve performance. The result is a model that maintains nearly all of its performance on previous tasks while learning the new tasks. In addition, CT0 maintains the original T0’s performance on unseen tasks (which is a big deal because those tasks could not appear in the replay buffer), and it extends the compositionality of T0 to even more unseen tasks.
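
As a rough illustration of the replay idea, the sketch below mixes a small sample of earlier training data into the data for a new task. The 1% fraction comes from the summary above. The function name, the per-task sampling, and the toy data are assumptions, not a description of how CT0 actually builds its buffer.

```python
import random


def build_rehearsal_mix(new_task_examples, previous_task_examples,
                        replay_fraction=0.01, seed=0):
    """Return the new task's examples plus a small replay sample of old ones.

    `previous_task_examples` maps a task name to its list of examples.
    Sampling the fraction separately per task is an assumption of this sketch.
    """
    rng = random.Random(seed)
    mixed = list(new_task_examples)
    for examples in previous_task_examples.values():
        # Keep at least one example per non-empty task, never more than exist.
        n_replay = min(len(examples), max(1, round(replay_fraction * len(examples))))
        mixed.extend(rng.sample(examples, n_replay))
    rng.shuffle(mixed)
    return mixed


# Usage with toy data: 50 new examples, two earlier tasks of 1000 and 200.
new_task = [("new", i) for i in range(50)]
old_tasks = {
    "task_a": [("a", i) for i in range(1000)],
    "task_b": [("b", i) for i in range(200)],
}
training_data = build_rehearsal_mix(new_task, old_tasks)
print(len(training_data))
```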
Read more