deep-learning
This post was created as an assignment in Bang Liu’s IFT6289 course in winter 2022. The structure of the post follows the structure of the assignment: summarization followed by my own comments.
paper summarization permalink The authors use the example of distinguishing between a Samoyed and a white wolf to talk about the importance of learning to rely on very small variations while ignoring others. While shallow classifiers must rely on human-crafted features which are expensive to build and always imperfect, deep classifiers are expected to learn their own features by applying a “general-purpose learning procedure” to learn the features and the classification layer from the data simultaneously.
Read moreAvatarify is a cool project that lets you create a relatively realistic avatar that you can use during video meetings. It works by creating a fake video input device and passing your video input through a neural network in PyTorch. My laptop doesn’t have a GPU, so I used the server/client setup.
setting up the server permalink Be sure you’ve installed the Nvidia Docker runtime so that the Docker container can use the GPU. You can see how I did that here. Run the following on the server:
Read moreThis is a long paper, so a lot of my writing here is an attempt to condense the discussion. I’ve taken the liberty to pull exact phrases and structure from the paper without explicitly using quotes.
Our main hypothesis is that deep learning succeeded in part because of a set of inductive biases, but that additional ones should be added in order to go from good in-distribution generalization in highly supervised learning tasks (or where strong and dense rewards are available), such as object recognition in images, to strong out-of-distribution generalization and transfer learning to new tasks with low sample complexity.
Read moreThis paper builds on what we learned in “Understanding deep learning requires rethinking generalization”. In that paper they showed that DNNs are able to fit pure noise in the same amount of time as it can fit real data, which means that our optimization algorithm (SGD, Adam, etc.) is not what’s keeping DNNs from overfitting.
experiments for detecting easy/hard samples permalink It looks like there are qualitative differences between a DNN that has memorized some data and a DNN that has seen real data. In experiments they found that real datasets contain “easy examples” that are more quickly learned than the hard examples. This is not the case for random data.
Read more