papers

These are my notes from research papers I read. Each page’s title is also a link to the abstract or PDF.

QA-GNN: reasoning with language models and knowledge graphs for question answering

This post was created as an assignment in Bang Liu’s IFT6289 course in winter 2022. The structure of the post follows the structure of the assignment: summarization followed by my own comments. The authors create a novel system for combining an LM and a knowledge graph by performing reasoning over a joint graph produced from the LM and the KG. This addresses the problem of irrelevant entities appearing in the retrieved knowledge subgraph and unifies the representations across the LM and KG.
Read more

Experienced well-being rises with income, even above $75,000 per year

Turns out that money does buy happiness. You may have heard the claim that people’s average happiness stops improving once they earn more than $75,000/year? Researchers ran a better survey with more data and found that this was not the case. They cited five methodological improvements over the earlier research that suggested income didn’t matter above $75,000:

1. They measured people’s happiness in real time, instead of having people try to remember past happiness levels.
2. They measured on a continuous scale instead of a discrete one. (The survey used a slider between “very bad” and “very good”, instead of discrete options to choose from.)
3. They measured on “dozens of separate occasions per person” using an app, instead of just a single questionnaire.
4. They used comparable scales for experienced well-being (how people feel during the day-to-day moments of their lives) and evaluative well-being (people’s evaluations of their lives).
5. They included a large number of high-earning participants and measured higher incomes in more granular increments.

The data were collected from 33,391 employed adults (ages 18-65) in the United States, for a total of 1,725,994 reports, or ~52 reports per person.
Read more

Neural message passing for quantum chemistry

This post was created as an assignment in Bang Liu’s IFT6289 course in winter 2022. The structure of the post follows the structure of the assignment: summarization followed by my own comments. To summarize, the authors create a unifying framework for describing message-passing neural networks, which they apply to the problem of predicting the quantum properties of the molecules in the QM9 dataset.

paper summarization

The authors first demonstrate that many of the recent works applying neural nets to this problem fit into a message-passing neural network (MPNN) framework. Under the MPNN framework, at each time step \(t\) a message is computed for each vertex by summing a learned function \(M_t\) applied to the vertex’s state, the states of its neighboring vertices, and the features of the connecting edges. The next state for each vertex is then a learned function \(U_t\) of its previous state and its message. Finally, a “readout” function \(R\) is applied to all the vertices to compute the result.
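In the paper’s notation, these updates are \(m_v^{t+1} = \sum_{w \in N(v)} M_t(h_v^t, h_w^t, e_{vw})\), \(h_v^{t+1} = U_t(h_v^t, m_v^{t+1})\), and \(\hat{y} = R(\{h_v^T \mid v \in G\})\). Here is a minimal sketch of one message-passing step and the readout in plain Python; the dictionary-based graph representation and the toy stand-ins for \(M_t\), \(U_t\), and \(R\) are my own illustration, not the paper’s learned networks:

```python
import numpy as np

def mpnn_step(h, edges, edge_feats, M_t, U_t):
    """One message-passing step: h maps each vertex to its state vector."""
    messages = {v: np.zeros_like(s) for v, s in h.items()}
    for (v, w), e_vw in zip(edges, edge_feats):
        # The message to a vertex sums M_t over its own state, each
        # neighbor's state, and the features of the connecting edge.
        messages[v] += M_t(h[v], h[w], e_vw)
        messages[w] += M_t(h[w], h[v], e_vw)  # undirected graph
    # The next state is a function of the previous state and the message.
    return {v: U_t(h[v], messages[v]) for v in h}

# Toy stand-ins for the learned functions (real MPNNs learn neural nets here).
M_t = lambda h_v, h_w, e_vw: h_w * e_vw
U_t = lambda h_v, m_v: np.tanh(h_v + m_v)
R = lambda h: sum(h.values()).sum()  # readout over all vertex states

# A three-vertex path graph with scalar edge features.
h = {0: np.ones(4), 1: np.zeros(4), 2: np.ones(4)}
h = mpnn_step(h, edges=[(0, 1), (1, 2)], edge_feats=[0.5, 2.0], M_t=M_t, U_t=U_t)
print(R(h))  # a graph-level prediction after one step
```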
Read more

The effect of model size on worst-group generalization

This was a paper we presented about in Irina Rish’s neural scaling laws course (IFT6167) in winter 2022. You can view the slides we used here, and the recording here.

Scaling laws for the few-shot adaptation of pre-trained image classifiers

The unsurprising result here is that few-shot performance scales predictably with pre-training dataset size under traditional fine-tuning, matching networks, and prototypical networks. The interesting result is that the exponents of these three approaches were substantially different (see Table 1 in the paper), which says to me that the few-shot inference approach matters a lot. The surprising result was that while more training on the “non-natural” Omniglot dataset did not improve few-shot accuracy on other datasets, training on “natural” datasets did improve few-shot accuracy on Omniglot.
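To make “exponent” concrete: analyses like this typically fit a power law \(\mathrm{error} \approx a N^{-b}\) to few-shot error as a function of pre-training set size \(N\), then compare the fitted exponent \(b\) across adaptation methods. A minimal sketch of such a fit; the data points below are invented for illustration, not taken from the paper:

```python
import numpy as np
from scipy.optimize import curve_fit

def power_law(n, a, b):
    return a * n ** (-b)

n = np.array([1e4, 1e5, 1e6, 1e7])        # pre-training set sizes
err = np.array([0.42, 0.30, 0.21, 0.15])  # invented few-shot error rates

(a, b), _ = curve_fit(power_law, n, err, p0=(1.0, 0.1))
print(f"fitted exponent b = {b:.3f}")  # compared across adaptation methods
```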
Read more