my face

I'm a first-year computer science PhD student at DIRO at the Université de Montréal, studying machine learning and NLP. I am advised by Bang Liu, and I'm currently thinking about worst-group generalization and extracting procedural knowledge from technical documents.

Previously I was a speech scientist at Cobalt Speech & Language, a company that designs custom speech recognition, text-to-speech, and dialogue models. While at Cobalt I worked on some neat projects, including language modeling for recognizing air traffic control speech and creating an online training system for ASR models.

I graduated from BYU with a BS in Applied and Computational Mathematics (ACME) with an emphasis in linguistics and a minor in computer science. ACME's rigorous curriculum includes graduate-level courses in algorithms, analysis, optimization, statistics, data science, optimal control, and machine learning.

During my undergrad I interned with Cobalt Speech, as well as Emergent Trading, an automated trading firm that made the news for reporting a problem in a Eurodollar exchange rule that unfairly favored larger competitors. (I developed the analysis tools that were used to track the issue down and determine how our opponent was taking advantage of the rule.)

contact (he/him)

Around the web I'm known by the username kylrth. I prefer to be contacted through the Matrix protocol. (If you'd like an account on my Matrix server, follow the instructions here.) My GPG public key is here.

Matrix  /  email  /  GitHub  /  LinkedIn  /  resume  /  phone  /  WhatsApp  /  Signal  /  Session

research

I'm currently developing methods that use language models to extract procedural knowledge from technical documents (e.g. generating WikiHow-style articles from reference documents). This is a Mitacs collaboration with Thales Canada.

I'm also interested in how model size affects the performance of models on hard examples (worst-group generalization). It's currently unclear whether model scale hurts or helps performance on rare subgroups, but what we know seems to suggest that the answer depends on the optimization objective (ERM, IRM, etc.). This started as a class project and turned into a research interest; see my research notes here.

other projects

WordBurner icon

In my spare time I'm working on an app that helps language learners build their vocabulary. Users can add words to their active list, and the app will recommend additional words that they might also want to learn. Besides using it myself to learn French, I hope this app will make it easier for displaced people to adapt to the culture where they find themselves, even when their language skills are already advanced.

the trapezoid used to picture vowels in phonology

speech2phone was a class project I worked on with two other ACME students during our senior year. We tried to tackle the problem of phoneme recognition on the TIMIT corpus by splitting it into the tasks of segmentation and classification. We had some fun exploring various algorithms and architectures. We didn't yet have a good sense of the state-of-the-art research, so we sort of threw everything at the problem and watched what happened. You can see the repo here and the final report here.